
HPC Newsletter 2001

Dear colleagues,

We hope you are safe and sound despite the current Coronavirus crisis. As you can imagine, the situation is also very challenging for the HPC team. Fortunately, since we have been operating large-scale infrastructures for over a decade, we are used to doing most of the work remotely. Hardware installation and defect management are the only tasks that require us to work in the basement; everything else is done over the management network that each server is equipped with. The team is now strictly divided into a morning shift and an afternoon shift at the Rechenzentrum, complemented by corresponding home office arrangements. Some of the on-site work is delayed, though, because the colleagues responsible for the network or rack management are not always available for coordination. At the same time, scheduled experiments to evaluate new features for the new cluster are taking longer than originally planned.

Best wishes and stay safe,
Your HPC Team

Table of Contents

NEMO technical advisory board and NEMO steering committee meetings

NEMO-2 application

NEMO AMD (+GPU) nodes for evaluation

Software Modules Changes

bwSFS Storage for Science

NEMO parallel filesystem extension

BeeOND - BeeGFS on Demand

Advance Warning: NEMO work spaces quota enforcement

Reset of HPC-NEWS mailing list

 

NEMO technical advisory board and NEMO steering committee meetings

Originally, we had planned meetings of the NEMO technical advisory board and the steering committee before Easter. These plans have been overtaken by the Coronavirus crisis. However, we will look for possible alternative dates in the upcoming weeks. The next meetings will certainly have to be held online.

NEMO-2 application

We agreed with the Ministry of Science, Research and the Arts and the bwHPC-S5 team to submit the NEMO-2 grant application for review by the bwHPC steering committee. If approved, the application can be submitted to the DFG in May, and funding will be decided upon during a "Baden-Württemberg" day in October. This is similar to the procedure we went through for the NEMO-1 application. In particular, we will need one representative for each of NEMO's principal scientific communities for this day in October.

NEMO AMD (+GPU) nodes for evaluation

Four nodes with AMD Rome processors (128 physical cores each) and one Nvidia T4 GPU per node are available on NEMO for evaluation purposes. We would like to get your feedback on your future expectations regarding GPU use, on projects which might require such functionality, and on whether the available GPUs match your profile. Scheduling of GPU access is still a challenge; please see the bwHPC Wiki for instructions on use. Additionally, the new bwUniCluster 2.0 also comes with dedicated GPU nodes; further details are available in the bwHPC Wiki.
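Once a job is running on one of these evaluation nodes (the exact submit options are described in the bwHPC Wiki), the standard NVIDIA tool can be used to confirm that the T4 is visible. The two commands below are a generic sketch, not NEMO-specific instructions:

  nvidia-smi                                             # should list one Tesla T4 on the node
  nvidia-smi --query-gpu=name,memory.total --format=csv  # print the GPU model and its memory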

Software Modules Changes

As announced in the last newsletter, we have hidden (but not yet deactivated) many old software modules, since we did not receive any feedback. They no longer appear in "module avail". All modules from the list will be deactivated on 20.04.2020. We can move software modules that are still needed to the deprecated section. Please inform us if you are still using software on the list and want to access it after April 20th. To see the deprecated section, you need to run "module load system/modules/deprecated" first; a short example is given below the module list.

List of modules
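The following sketch shows how the deprecated section becomes visible; the module name in the last line is a placeholder, not an actual entry from the list:

  module avail                              # hidden modules no longer show up here
  module load system/modules/deprecated     # make the deprecated section visible
  module avail                              # deprecated modules are now listed as well
  module load <category>/<name>/<version>   # load a deprecated module as usual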

bwSFS Storage for Science

The procurement of the bwSFS (Storage for Science) system, which is coordinated together with Tübingen and primarily focused on the NEMO and BinAC cluster communities, is progressing. We have issued the "final call" to receive the final offers from the three remaining bidders we have been negotiating with. Hopefully, we can identify a winner of that final round and place a purchase order. Ideally, project setup and installation will follow soon, together with a work plan to bring the system up. We hope to be able to offer the first file system and object store services from Q3 on. In parallel, we are working on higher-level research data management services such as Invenio (the software behind Zenodo) to allow data publication with proper metadata and a DOI in the near future. To simplify the proper assignment of data to people, we propose to use ORCID IDs, as suggested by the RDM Working Group of the nine Baden-Württemberg universities.

NEMO parallel filesystem extension

NEMO's parallel file system has been upgraded to a net capacity of 960 TB. Additionally, two extra services for metadata operations have been deployed.

NEMO's parallel file system is the workhorse for data-intensive computing tasks. It was designed to work efficiently on large amounts of transient data, accessible from each compute node through a common namespace. The work space mechanism was introduced to prevent the parallel file system from filling up in an uncontrolled way; it forces users to renew their storage allocations every 100 days. However, with NEMO's popularity rising and new users arriving, the demand for storage has increased as well. Therefore, the parallel file system has been upgraded to a net capacity of almost one Petabyte (960 Terabyte, to be exact). Operations on file system metadata (creation, deletion or merely checking for the existence of files) had been identified as a possible bottleneck. To improve performance, two additional metadata services have been deployed. NEMO users should not take this as an invitation to violate the current soft quotas (20 TB, 1 million files).
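As a brief reminder, work spaces are managed with the usual workspace tools. The commands below are a sketch: the work space name is a placeholder, and the exact limits and options on NEMO are documented in the bwHPC Wiki:

  ws_list                        # show your work spaces and their remaining lifetime
  ws_allocate my_workspace 100   # allocate a work space named "my_workspace" for 100 days
  ws_extend my_workspace 100     # renew it for another 100 days before it expires
  ws_release my_workspace        # release it once the data is no longer needed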

BeeOND - BeeGFS on Demand

With the update of the parallel file system BeeGFS, we introduced BeeOND for testing with multi-node jobs. When BeeOND is requested upon job submission, a parallel file system is spanned across the local solid state disks of all nodes belonging to the job (220 GB per node). You can then access /mnt/beeond during the job's runtime to exchange and store data. Be aware that the file system is destroyed when the job exits; any data that was not saved manually before the end of the job is lost. To request BeeOND, simply add -v beeond=1 or -v beeond=true to your job script or command line. If you experience problems, please inform us so that we can improve this service.
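A minimal sketch of how a multi-node job could stage data through BeeOND; the submit command (msub), the resource request and all paths are assumptions for illustration and should be adapted to your usual workflow:

  # request two nodes and enable BeeOND for the job
  msub -l nodes=2:ppn=20 -v beeond=1 myjob.sh

  # inside myjob.sh: stage input into the job-local BeeOND file system,
  # run the application there, and copy the results back before the job ends
  cp -r $HOME/input /mnt/beeond/
  cd /mnt/beeond
  ./my_application
  cp -r /mnt/beeond/results $HOME/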

Advance Warning: NEMO work spaces quota enforcement

The parallel file system was almost full before the upgrade, and since no further upgrades are planned, more restrictive quota handling will be needed in the future. The number of research groups has grown significantly over the last three years and is still increasing. Thus, at the next larger maintenance window, we will enforce the quota mechanism to better balance the resources between the different groups. We plan to set a standard hard quota of 10 TB per user. These quotas can be extended on individual request, but a valid justification has to be given and the extension will be subject to periodic review.

Reset of HPC-NEWS mailing list

The hpc-news mailing list will be reset to better comply with data protection regulations. To this end, all current subscriptions will be deleted. Only registered NEMO users will be automatically resubscribed. If you are not a NEMO user but are still interested in continuing to receive HPC-related news, we kindly ask you to resubscribe to the mailing list. Please use an institutional address from your university for that; other addresses will not be accepted.

The hpc-news mailing list is meant to complement the nemo-users mailing list. The nemo-users list deals with questions closely related to NEMO operation. The hpc-news list is meant to report on a larger scale. We thereby acknowledge that not everybody with a generic interest in HPC is necessarily a registered NEMO user.

To opt in to the hpc-news mailing list, simply write an E-Mail to hpc-news-subscribe@hpc.uni-freiburg.de. Remember to use the institutional address from your university. Registered NEMO users will be automatically subscribed.

To opt out of the hpc-news mailing list (e.g. if you were automatically subscribed as a registered NEMO user), please write an E-Mail to hpc-news-unsubscribe@hpc.uni-freiburg.de.

It is also possible to opt out of the nemo-users mailing list. We advise against this, since you will then not receive any mails regarding NEMO operations; in particular, this means no warnings via E-Mail regarding downtimes and operational problems. If you still wish to opt out, please send an E-Mail to .


HPC Team, Rechenzentrum, Universität Freiburg
http://www.hpc.uni-freiburg.de

bwHPC initiative and bwHPC-S5 project
http://www.bwhpc.de

To subscribe to our mailing list, please send an e-mail to hpc-news-subscribe@hpc.uni-freiburg.de
If you would like to unsubscribe, please send an e-mail to hpc-news-unsubscribe@hpc.uni-freiburg.de

Previous newsletters: http://www.hpc.uni-freiburg.de/news/newsletters

For questions and support, please use our support address enm-support@hpc.uni-freiburg.de
