Release Notes
The following are the contents of the RELEASE_NOTES file as distributed with the Slurm source code for this release. Please also refer to the NEWS file included alongside the source for more detailed descriptions of the associated changes, and for bugs fixed within each maintenance release.
RELEASE NOTES FOR SLURM VERSION 21.08

IMPORTANT NOTES:
If using the slurmdbd (Slurm DataBase Daemon) you must update this first.

NOTE: If using a backup DBD you must start the primary first to do any
database conversion; the backup will not start until this has happened.

The 21.08 slurmdbd will work with Slurm daemons of version 20.02 and above.
You will not need to update all clusters at the same time, but it is very
important to update slurmdbd first and have it running before updating
any other clusters making use of it.

Slurm can be upgraded from version 20.02 or 20.11 to version 21.08 without loss
of jobs or other state information. Upgrading directly from an earlier version
of Slurm will result in loss of state information.

All SPANK plugins must be recompiled when upgrading from any Slurm version
prior to 21.08.

NOTE: If upgrading from a version prior to 20.11: the AllocGres and ReqGres
columns have been removed from the database, as they duplicate information
available to those using AccountingStorageTRES. If you use these columns, make
sure to back up this information before upgrading, since it will be lost, and
set up AccountingStorageTRES with your GRES to be able to get equivalent
information out of AllocTRES and ReqTRES after the upgrade.

NOTE: PMIx v1.1.4 and below are no longer supported.

HIGHLIGHTS
==========
 -- Removed gres/mic plugin used to support Xeon Phi coprocessors.
 -- Add LimitFactor to the QOS. A float that is factored into an association's
    GrpTRES limits. For example, if the LimitFactor is 2, then an association
    with a GrpTRES of 30 CPUs would be allowed to allocate 60 CPUs when
    running under this QOS.
 -- A job's next_step_id counter now resets to 0 after being requeued.
    Previously, the step IDs would continue from the job's last run.
 -- API change: Removed slurm_kill_job_msg and modified the function signature
    for slurm_kill_job2. slurm_kill_job2 should be used instead of
    slurm_kill_job_msg.
 -- AccountingStoreFlags=job_script allows you to store the job's batch script.
 -- AccountingStoreFlags=job_env allows you to store the job's environment
    variables.
 -- configure: the --with option handling has been made consistent across the
    various optional libraries. Specifying --with-foo=/path/to/foo will only
    check that directory for the applicable library (rather than, in some cases,
    falling back to the default directories), and will always fail the build
    if the library is not found (instead of a mix of error messages and
    non-fatal warning messages).
 -- configure: replace --with-rmsi_dir option with proper handling for
    --with-rsmi=dir.
 -- Removed sched/hold plugin.
 -- cli_filter/lua, jobcomp/lua, job_submit/lua now load their scripts from the
    same directory as the slurm.conf file (and thus now will respect changes
    to the SLURM_CONF environment variable).
 -- SPANK - call slurm_spank_init if defined without slurm_spank_slurmd_exit in
    slurmd context.
 -- Add new 'PLANNED' state to a node to represent when the backfill scheduler
    has it planned to be used in the future instead of showing as 'IDLE'.
    sreport also has changed its cluster utilization report column name from
    'Reserved' to 'Planned' to match this nomenclature.
 -- Put node into "INVAL" state upon registering with an invalid node
    configuration. Node must register with a valid configuration to continue.
 -- Remove SLURM_DIST_LLLP environment variable in favor of just
    SLURM_DISTRIBUTION.
 -- Make --cpu-bind=threads default for --threads-per-core; the command line
    and environment can override this.
 -- slurmd - allow multiple comma-separated controllers to be specified in
    configless mode with --conf-server.
 -- configure - add --with-systemdsystemunitdir option to automatically install
    the Slurm unit files.
 -- Manual powering down of nodes with scontrol now ignores SuspendExc.
 -- Distinguish queued reboot requests (REBOOT@) from issued reboots (REBOOT^).
 -- auth/jwt - add support for RS256 tokens. Also permit the username in the
    'username' field in addition to the 'sun' (Slurm UserName) field.
 -- service files - change dependency to network-online rather than just
    network to ensure DNS and other services are available.
 -- Add "Extra" field to node to store extra information other than a comment.
 -- Add ResumeTimeout, SuspendTimeout and SuspendTime to Partitions.
 -- The memory.force_empty parameter is no longer set by jobacct_gather/cgroup
    when deleting the cgroup. This previously caused a significant delay (~2s)
    when terminating a job, and is not believed to have provided any
    perceivable benefit. However, this may lead to slightly higher reported
    kernel memory page cache usage since the kernel cgroup memory is no longer
    freed immediately.
 -- TaskPluginParam=verbose is now treated as a default. Previously it would be
    applied regardless of the job specifying a --cpu-bind.
 -- Add "node_reg_mem_percent" SlurmctldParameter to define percentage of
    memory nodes are allowed to register with.
 -- Define and separate node power state transitions. Previously a powering
    down node was in both states, POWERING_OFF and POWERED_OFF. These are now
    separated.
    e.g.
    IDLE+POWERED_OFF (IDLE~)
    -> IDLE+POWERING_UP (IDLE#)   - Manual power up or allocation
    -> IDLE
    -> IDLE+POWER_DOWN (IDLE!)    - Node waiting for power down
    -> IDLE+POWERING_DOWN (IDLE%) - Node powering down
    -> IDLE+POWERED_OFF (IDLE~)   - Powered off
 -- Some node state flag names have changed. These would be noticeable for
    example if using a state flag to filter nodes with sinfo.
    e.g.
    POWER_UP -> POWERING_UP
    POWER_DOWN -> POWERED_DOWN
    POWER_DOWN now represents a node pending power down
 -- Create a new process called slurmscriptd which runs PrologSlurmctld and
    EpilogSlurmctld. This avoids fork() calls from slurmctld, and can avoid
    performance issues if the slurmctld has a large memory footprint.
 -- Pass JSON of job to node mappings to ResumeProgram.
 -- QOS accrue limits only apply to the job QOS, not partition QOS.
 -- Any return code from SPANK plugin or SPANK function that is not
    SLURM_SUCCESS (zero) will be considered to be an error. Previously, only
    negative return codes were considered an error.
 -- Add support for automatically detecting and broadcasting executable shared
    object dependencies for sbcast and srun --bcast.
 -- All SPANK error codes now start at 3000. Where previously SPANK would give
    a return code of 1, it will now return 3000. This change will break ABI
    compatibility with SPANK plugins compiled against older versions of Slurm.
 -- SPANK plugins are now required to match the current Slurm release, and must
    be recompiled for each new Slurm major release. (They do not need to be
    recompiled when upgrading between maintenance releases.)
 -- SLURM_NODE_ALIASES now has brackets around the node's address to be able to
    distinguish IPv6 addresses. e.g. :[ ]:
 -- The job_container/tmpfs plugin now requires PrologFlags=contain to be set
    in slurm.conf.
 -- Limit max_script_size to 512 MB.

CONFIGURATION FILE CHANGES (see the appropriate man page for details)
=====================================================================
 -- Errors detected in the parser handlers due to invalid configurations are
    now propagated and can cause the calling process to fatal (and thus exit).
 -- Enforce a valid configuration for AccountingStorageEnforce in slurm.conf.
    If the configuration is invalid, then an error message will be printed and
    the command or daemon (including slurmctld) will not run.
 -- Removed AccountingStoreJobComment option. Please update your config to use
    AccountingStoreFlags=job_comment instead.
 -- Removed DefaultStorage{Host,Loc,Pass,Port,Type,User} options.
 -- Removed CacheGroups, CheckpointType, JobCheckpointDir, MemLimitEnforce,
    SchedulerPort, SchedulerRootFilter options.
 -- Added Script DebugFlags for debugging slurmscriptd (the process that runs
    slurmctld scripts such as PrologSlurmctld and EpilogSlurmctld).
 -- Rename SbcastParameters to BcastParameters.
 -- systemd service files - add new '-s' option to each daemon which will
    change the working directory even with the -D option. (Ensures any core
    files are placed in an accessible location, rather than /.)
 -- Added BcastParameters=send_libs and BcastExclude options.
 -- Remove the (incomplete) burst_buffer/generic plugin.
 -- Make SelectTypeParameters=CR_Core_Memory default for cons_tres and cons_res.
 -- Remove support for TaskAffinity=yes in cgroup.conf. Adding task/affinity to
    TaskPlugins in slurm.conf is strongly recommended instead.

COMMAND CHANGES (see man pages for details)
===========================================
 -- Changed the --format handling for negative field widths (left justified)
    to apply to the column headers as well as the printed fields.
 -- Invalidate multiple partition requests when using partition-based
    associations.
 -- scrontab - create the temporary file under the TMPDIR environment variable
    (if set), otherwise continue to use TmpFS as configured in slurm.conf.
 -- sbcast / srun --bcast - removed support for zlib compression. lz4 is vastly
    superior in performance, and (counter-intuitively) zlib could provide worse
    performance than no compression at all on many systems.
 -- sacctmgr - changed column headings to "ParentID" and "ParentName" instead
    of "Par ID" and "Par Name" respectively.
 -- SALLOC_THREADS_PER_CORE and SBATCH_THREADS_PER_CORE have been added as
    input environment variables for salloc and sbatch, respectively. They do
    the same thing as --threads-per-core.
 -- Don't display node's comment with "scontrol show nodes" unless set.
 -- Added SLURM_GPUS_ON_NODE environment variable within each job/step.
 -- sreport - change to sorting TopUsage by the --tres option.
 -- slurmrestd - do not allow operation as SlurmUser/root by default.
 -- "scontrol show node" now shows State as base_state+flags instead of
    shortened state with flags appended, e.g. IDLE# -> IDLE+POWERING_UP.
    Also, the "POWER" state flag string is now "POWERED_DOWN".
 -- scrontab - add ability to update crontab from a file or standard input.
 -- scrontab - added ability to set and expand variables.
 -- Make srun sensitive to BcastParameters.
 -- Added sbcast/srun --send-libs, sbcast --exclude and srun --bcast-exclude.
 -- Changed ReqMem field in sacct to match memory from ReqTRES. It now shows
    the requested memory of the whole job with a letter appended indicating
    units (M for megabytes, G for gigabytes, etc.). ReqMem is only displayed
    for the job, since the step does not have requested TRES. Previously
    ReqMem was also displayed for the step but was just displaying ReqMem for
    the job.

API CHANGES
===========
 -- jobcomp plugin: change plugin API to jobcomp_p_*().
 -- sched plugin: change plugin API to sched_p_*() and remove
    slurm_sched_p_initial_priority() call.
 -- step_ctx code has been removed from the api.
 -- slurm_stepd_get_info()/stepd_get_info() has been removed from the api.
 -- The v0.0.35 OpenAPI plugin has now been marked as deprecated. Please
    convert your requests to the v0.0.37 OpenAPI plugin.
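
EXAMPLE: NODE POWER STATE CYCLE (illustrative sketch, not part of Slurm)

The power-state cycle given as an example under HIGHLIGHTS (IDLE~ -> IDLE# ->
IDLE -> IDLE! -> IDLE% -> IDLE~) can be modelled as a small lookup table, which
may help when adapting scripts that parse node states. The Python below is only
a sketch: the state strings and symbols are copied from that example, while the
POWER_CYCLE table and the next_state() and expand_short() helpers are
hypothetical names invented here and are not part of any Slurm API. The flag
strings your site actually sees may differ (the notes also mention
POWERED_DOWN), so treat the table as an assumption to check against real sinfo
and scontrol output.

```python
# Illustrative model of the node power-state cycle from the HIGHLIGHTS example.
# The state strings and symbols are copied from the notes; the table layout and
# both helper functions are hypothetical and are not part of any Slurm API.

# Each entry: (base_state+flag form, shortened form). Order follows the example.
POWER_CYCLE = [
    ("IDLE+POWERED_OFF", "IDLE~"),    # powered off
    ("IDLE+POWERING_UP", "IDLE#"),    # manual power up or allocation
    ("IDLE", "IDLE"),                 # no power flag shown
    ("IDLE+POWER_DOWN", "IDLE!"),     # node waiting for power down
    ("IDLE+POWERING_DOWN", "IDLE%"),  # node powering down
]


def next_state(state):
    """Return the state that follows `state` in the example cycle."""
    names = [full for full, _ in POWER_CYCLE]
    index = names.index(state)  # raises ValueError for an unknown state
    return names[(index + 1) % len(names)]


def expand_short(short):
    """Map a shortened state such as 'IDLE#' to its base_state+flag form."""
    for full, abbrev in POWER_CYCLE:
        if abbrev == short:
            return full
    raise ValueError(f"unknown shortened state: {short!r}")


if __name__ == "__main__":
    # Walk once around the cycle, starting from a powered-off node.
    state = "IDLE+POWERED_OFF"
    for _ in range(len(POWER_CYCLE)):
        following = next_state(state)
        print(f"{state} -> {following}")
        state = following
```

A script that previously matched the shortened form (for example IDLE#) could
use a mapping like expand_short() as a starting point when moving to the
base_state+flags form now printed by "scontrol show node".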