Difference between revisions of "GIGA:MS"

(Created page with "<div style="text-align: right;"><small>''Sourced from [http://www.giga.ulg.ac.be/cms/c_27138/fr/stockage-de-masse-petabyte GIGA's Intranet].''<BR />''For more informations on...")
 
Line 4: Line 4:
 
==Purpose and scope==
 
==Purpose and scope==
  
 
+
The purpose of this procedure is to describe the principle of using the GIGA mass storage server commonly known as '''[[Petabyte]]''' and to define the modalities of its use. This storage system is connected to the '''ULg scientific network'''.
The purpose of this procedure is to describe the principle of using the GIGA mass storage server commonly known as '''Petabyte''' and to define the modalities of its use. This storage system is located in the basement (floor -3) of building B34 (GIGA Tower) and is connected to the optical fiber in what is known as the '''ULg scientific network'''.
 
  
  
  
 
==Terms of Use==
 
==Terms of Use==
 
 
The GIGA provides users with a centralized storage space to host strictly professional content. To ensure a quality service for all users, each user must practise responsible '''fair use''' and respect this charter.
 
 
 
  
 
===Charter===
 
===Charter===
 
  
 
''The storage space of the GIGA cannot be used to host:''
 
''The storage space of the GIGA cannot be used to host:''
 
 
* Personal data (vacation photos, accounting, ...)
 
* Personal data (vacation photos, accounting, ...)
 
 
* Professional data that is not related to the function within the ULg and/or CHU (complementary occupations, …)
 
* Professional data that is not related to the function within the ULg and/or CHU (complementary occupations, …)
 
 
* Obscene, pornographic, racist or hate speech documents
 
* Obscene, pornographic, racist or hate speech documents
 
 
* Copyrighted material for which you do not have the right to use and/or make copies (music, films, …)
 
* Copyrighted material for which you do not have the right to use and/or make copies (music, films, …)
 
 
* Malicious software
 
* Malicious software
 
 
* Data that may compromise system stability, analysis or vulnerability testing
 
* Data that may compromise system stability, analysis or vulnerability testing
 
 
* Advertising, commercial or illegal action
 
* Advertising, commercial or illegal action
 
 
* Data or files that are not intended for you
 
* Data or files that are not intended for you
 
 
 
''It cannot be used to:''
 
 
* Steal an identity and/or authenticate on a system for which you do not have permissions
 
 
* Violate the law in any way and/or violate the privacy of others.
 
 
* Broadcast or sell data on disk space
 
  
  
Line 51: Line 26:
 
===Quota===
 
===Quota===
  
 +
Each user has a '''personal space''' for storing their data. This space is currently limited to '''''100 GB per user'''''. This quota can be ''reviewed at any time'' by the GIGA.
  
Each user has a personal space for storing their data. This space is currently limited to 100 GB per user. This quota can be reviewed at any time by the GIGA.
+
Inside each user storage space are also found links to group spaces : Administration ('''ADM'''), Platforms ('''PTF''') and Research ('''URT''').  
  
 +
These directories and their contents are subject to '''''separated quotas''''', depending on the different entities. Heads of departments are responsible for defining these quotas and the use of their shared files. ''These folders are accessible from each user's personal space via shortcuts to the different entities''.
  
The storage space is split into three main entities: Administration (ADM), Platforms (PTF) and Research (URT).
+
Being in the ''user'' as well as the ''group'' space, they are all '''backuped every night''' '''''between 0 a.m. and 6 a.m.''''' meaning, ''in the cases listed under this paragraph'', you might accidentally '''create huge amount of computing work''' for the backup system and that would '''''result into an absence of it for at least one day'''''.
 
 
 
 
Added to this, are the personal "home" spaces and the "scratch" (SCR) space linked to the computing grid. Scratch spaces will not be saved.
 
 
 
The working groups of each GIGA entity (Platforms, Research, and Administration) have folders that allow sharing among several users belonging to the same entity. The tree structure for each of these entities is organized as follows:
 
 
 
<center>
 
 
 
{| class="wikitable"
 
 
 
|-
 
 
 
! Entities !! Workgroup !! Description
 
 
 
|-
 
 
 
| '''Administration''' || COO || General coordination of GIGA
 
 
 
|-
 
 
 
|  || EXE || Executive Secretariat
 
 
 
|-
 
 
 
|  || ITG || IT management
 
 
 
|-
 
 
 
|  || SAQ || Quality assurance
 
 
 
|-
 
 
 
| '''Platform''' || BBK || BioBank GIGA-CHU
 
 
 
|-
 
 
 
|  || CRC || Cyclotron
 
 
 
|-
 
 
 
|  || GEN || GIGA-Genomics
 
 
 
|-
 
 
 
|  || HIS || GIGA-ImmunoHistology
 
 
 
|-
 
 
 
|  || IMG || GIGA-Imaging
 
 
 
|-
 
 
 
|  || INF || GIGA-BioInformatics
 
 
 
|-
 
 
 
|  || INT || GIGA-Interatomics
 
 
 
|-
 
 
 
|  || PRO || GIGA-Proteomics
 
 
 
|-
 
 
 
|  || SPF || GIGA-SPF
 
 
 
|-
 
 
 
|  || VIR || GIGA-Viral vectors
 
 
 
|-
 
 
 
|  || ZEB || GIGA-Zebrafish
 
 
 
|-
 
 
 
| '''Research''' || CAN || Cancer
 
 
 
|-
 
 
 
|  || CAR || Cardiovascular sciences
 
 
 
|-
 
 
 
|  || CRC || Cyclotron
 
 
 
|-
 
 
 
|  || CSG || Coma Science Group
 
 
 
|-
 
 
 
|  || GEN || Medical Genomics
 
 
 
|-
 
 
 
|  || III || Infection, Immunity and inflammation
 
 
 
|-
 
 
 
|  || MBD || Molecular Biology of diseases
 
 
 
|-
 
 
 
|  || MED || In Silico Medecine
 
 
 
|-
 
 
 
|  || NEU || Neurosciences
 
 
 
|}
 
 
 
</center>
 
  
 +
'''Any action on data that weights''' a lot might need some special attentions and/or '''contacting support''' before doing it, like for example:
 +
* Renaming folders and/or files
 +
* Copy and/or move data
 +
* Change ownership and/or rights
 +
* Add data
  
These files are subject to a quota depending on the different entities. Heads of departments are responsible for defining these quotas and the use of their shared files. These folders are accessible from each user's personal space via shortcuts to the different entities.
+
In that perspective has been created a special condition : If you create, ''anywhere'' and ''as much as you want'', a directory called '''nobackup''', everything in it will not be backuped and so won't disturb the whole process.
  
  
Line 180: Line 49:
 
Raw data originating from scientific and/or cluster computer cannot be transferred and/or stored in the GIGA storage space as it. It must be the subject of prior consultation with the System manager for the implementation of a specific action.
 
Raw data originating from scientific and/or cluster computer cannot be transferred and/or stored in the GIGA storage space as it. It must be the subject of prior consultation with the System manager for the implementation of a specific action.
  
 +
It is mandatory to encapsulate and compress ('''ZIP''', '''RAR''', '''TAR'''...) this data in a file to facilitate their management and processing by the storage servers.
  
Indeed, by the nature of their trees, the servers can hardly manage and save this data. This affects the overall storage performance and penalizes all users
+
Any filename with '''non-UTF-8 characters''' (''mostly for Windows users'') or '''special characters''' such as '''/\:*?"'<>|''' will not be included in the backup.
  
 +
It is prohibited to create files or folders in '''/gpfs0''' nor in '''/gpfs0/home''' (Linked to '''/home/mass''')
  
It is mandatory to encapsulate and compress (ZIP, RAR, TAR ...) in a file this data to facilitate the management and processing of these data by the storage servers.
+
The users' home directory '''''cannot be used as a temporary space for calculation or cluster document'''''. The number of files generated during the calculations (sometimes tens of thousands) could completely block the storage space. There is a computing space on the [[Grid]] to do this.
  
 
+
The scratch space ('''SCR''') is dedicated to the '''''temporary space of calculation'''''. This space ''will not be backup'' and if there is a problem of free space, it can be deleted of its contents to release space.
Any filename with non-UTF-8 characters or special characters such as '''/\:*?"'<>|''' will not be included in the backup.
 
 
 
 
 
Any breach of this obligation may, in case of abuse, end with the deletion of data and the withdrawal of access to the GIGA storage space.
 
 
 
 
 
It is prohibited to create files or folders in / gpfs0 nor in / gpfs0 / home (= / home / mass)
 
 
 
 
 
The users' home directory cannot be used as a temporary space for calculation or cluster document. The number of files generated during the calculations (sometimes tens of thousands) could completely block the storage space. There is a computing space on the grids to do this.
 
 
 
 
 
The scratch space (SCR) is dedicated to the temporary space of calculation. This space will not be backup and if there is a problem of free space, it can be deleted of its contents to release space.
 
  
  
Line 205: Line 63:
 
==Methods of access==
 
==Methods of access==
  
 
+
Access to the storage is via your ULg's username and password.
The storage space mainly uses the '''SMB/SAMBA/CIFS''' protocol. This protocol is available on most operating systems (Windows, Mac OS X, Linux ...) and allows the sharing of resources (files) on local networks.
 
 
 
 
 
Access to the storage is via the ULg username and password.
 
  
  
  
 
===Windows===
 
===Windows===
 
 
 
* Go to '''My Computer'''
 
* Go to '''My Computer'''
 
 
* Click '''Map Network Drive''' in the menu bar at the top of the window
 
* Click '''Map Network Drive''' in the menu bar at the top of the window
 
 
* Specify a '''Drive''' letter
 
* Specify a '''Drive''' letter
 
 
:- Example: '''Z:'''
 
:- Example: '''Z:'''
 
 
* In '''Folder''' path, enter the storage address followed by your ULg username.
 
* In '''Folder''' path, enter the storage address followed by your ULg username.
 
 
:- Example: '''\\storage.giga.priv\u123456'''
 
:- Example: '''\\storage.giga.priv\u123456'''
 
 
* Notch '''Connect using different credentials''' checkbox
 
* Notch '''Connect using different credentials''' checkbox
 
 
* Click '''Finish'''
 
* Click '''Finish'''
 
 
* A window appears asking for your ULg username and password.
 
* A window appears asking for your ULg username and password.
 
 
* Enter your ULG username after the word '''ULG\'''
 
* Enter your ULG username after the word '''ULG\'''
 
 
:- Example: '''ULG\u123456'''
 
:- Example: '''ULG\u123456'''
 
 
* Enter your password and click '''OK'''
 
* Enter your password and click '''OK'''
 
 
* You should be connected to your home storage
 
* You should be connected to your home storage
  
Line 245: Line 85:
  
 
===Mac OS===
 
===Mac OS===
 
 
 
* In '''Finder''', select '''Go > Connect to Server'''
 
* In '''Finder''', select '''Go > Connect to Server'''
 
 
* In the '''Server Address''', enter '''cifs://''' followed by the storage address and your ULg username
 
* In the '''Server Address''', enter '''cifs://''' followed by the storage address and your ULg username
 
 
:- Example: '''cifs://storage.giga.priv/u123456'''
 
:- Example: '''cifs://storage.giga.priv/u123456'''
 
 
* You will need to enter your ULg username and password
 
* You will need to enter your ULg username and password
 
 
* You should be connected to your home storage
 
* You should be connected to your home storage
  
Line 260: Line 94:
  
 
===Linux===
 
===Linux===
 
 
 
* Depending on the Linux distribution, the connection method may vary.
 
* Depending on the Linux distribution, the connection method may vary.
 
 
* As a rule, you will be able to select '''Connect to a Server''' in the Linux window manager.
 
* As a rule, you will be able to select '''Connect to a Server''' in the Linux window manager.
 
 
* You will then have to enter the storage address followed by your ULg username.
 
* You will then have to enter the storage address followed by your ULg username.
 
 
:- Example: '''smb://storage.giga.priv/u123456'''
 
:- Example: '''smb://storage.giga.priv/u123456'''
 
 
* You will be ask for your ULg username and password.
 
* You will be ask for your ULg username and password.
 
 
* You should be connected to your home storage
 
* You should be connected to your home storage
  
Line 282: Line 109:
  
  
To log in to the VPN, please visit https://www.ulg.ac.be/vpn. You should use applications available at the bottom of the page.
+
To log in to the VPN, please visit https://www.ulg.ac.be/vpn. You should use the applications available at the bottom of the page.
 
 
 
 
 
 
==Backup==
 
 
 
 
 
Personal spaces (home space) as well as shared work folders are saved daily (evening) in a decentralized way.
 
 
 
 
 
A saved file can have up to 28 versions, one per day for about a month. This means that a deleted file is kept for a maximum of 28 days; once that time has passed, the file is lost for good. It is also possible to return to an older version of a file (within 28 days), but only to the last version that existed before each night's backup. It is therefore not possible to retrieve the intermediate versions of a file that was modified several times during the same day. Only the latest version from each previous day is recoverable.
 
 
 
 
 
Data movements during the '''backup window''', between 0 a.m. and 6 a.m., must be limited so that the backup can finish without error.
 
 
 
 
 
File recovery is a slow process and can be heavy in some cases. It must therefore be used sparingly, not routinely. Recovery requests should be a last resort and should not be repeated, so that the same service can be provided to all users.
 
 
 
 
 
Any folder may contain a '''nobackup''' folder. Anyone can create this folder anywhere in the folder tree. The contents of this folder will not be included in the backup.
 
 
 
 
 
Any action on 1 TB of data or more, for example:
 
 
 
* Renaming folders and/or files
 
 
 
* Copy and/or move data
 
 
 
* Change ownership and/or rights
 
 
 
* Add data
 
 
 
 
 
Should be reported to the IT department, so that it does not disrupt the next night's operations.
 
 
 
 
 
 
 
==Migration of data (HSM)==
 
 
 
 
 
HSM (hierarchical storage management) is a data storage system that automatically moves data between high-cost and low-cost storage media. This function exists because high-speed storage units, such as hard disk drives, are more expensive than slower units, such as magnetic tape drives. HSM can therefore be used to keep less frequently used data on slower units and recall it to faster disk drives only when necessary. These transfers (migrations) follow pre-established rules applied to each file, mostly based on the date of last access. For example, all files that have not been accessed for more than 6 months are migrated. Migrated data simply resides on another medium and of course remains accessible.
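The last-access rule described above can be illustrated with a minimal Python sketch. It only lists the files a "not accessed for more than 6 months" rule would select; it does not reproduce the actual HSM software. The function name `migration_candidates` and the 183-day approximation of six months are assumptions made for illustration.

```python
import os
import time

# Assumption: "6 months" is approximated as 183 days.
SIX_MONTHS_SECONDS = 183 * 24 * 3600


def migration_candidates(root, max_idle=SIX_MONTHS_SECONDS, now=None):
    """Yield files under root whose last access is older than max_idle seconds."""
    now = time.time() if now is None else now
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # st_atime is the last access time reported by the file system;
            # it may be unreliable on volumes mounted with access-time updates disabled.
            if now - os.stat(path).st_atime > max_idle:
                yield path
```

Calling `list(migration_candidates("some/folder"))` returns the paths that such a rule would consider for migration.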
 

Revision as of 13:50, 15 November 2017

Sourced from GIGA's Intranet.
For more information on this topic (user quotas), please contact Ersen Yasar


Purpose and scope

This procedure describes how to use the GIGA mass storage server, commonly known as Petabyte, and defines the terms of its use. This storage system is connected to the ULg scientific network.


Terms of Use

Charter

The storage space of the GIGA cannot be used to host:

  • Personal data (vacation photos, accounting, ...)
  • Professional data unrelated to your position within the ULg and/or CHU (secondary occupations, …)
  • Obscene, pornographic, racist or hateful documents
  • Copyrighted material that you do not have the right to use and/or copy (music, films, …)
  • Malicious software
  • Data that may compromise system stability, or that serves system analysis or vulnerability testing
  • Advertising, commercial or illegal material
  • Data or files that are not intended for you


Quota

Each user has a personal space for storing their data. This space is currently limited to 100 GB per user. This quota can be reviewed at any time by the GIGA.
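As a rough way to see how close a personal space is to this limit, the following Python sketch sums file sizes under a folder and compares the total with the quota. It is only an approximation of what the storage server accounts for. The names `tree_size` and `quota_usage` are hypothetical, and reading "100 GB" as decimal gigabytes is an assumption.

```python
import os

# Assumption: "100 GB" is read as 100 decimal gigabytes.
QUOTA_BYTES = 100 * 1000**3


def tree_size(root):
    """Sum the sizes of all regular files below root.

    os.walk does not descend into symbolic links to directories by default,
    so linked group spaces are not counted here.
    """
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def quota_usage(root, quota=QUOTA_BYTES):
    """Return (bytes used, fraction of the quota used) for root."""
    used = tree_size(root)
    return used, used / quota
```

For example, `quota_usage("Z:\\")` on a mapped Windows drive would return the bytes used and the fraction of the 100 GB consumed, under the assumptions above.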

Each user's storage space also contains links to the group spaces: Administration (ADM), Platforms (PTF) and Research (URT).

These directories and their contents are subject to separate quotas, depending on the entity. Department heads are responsible for defining these quotas and the use of their shared files. These folders are accessible from each user's personal space via shortcuts to the different entities.

Because they sit in the user and group spaces, they are all backed up every night between 0 a.m. and 6 a.m. This means that, in the cases listed under this paragraph, you might accidentally create a huge amount of work for the backup system, which would result in no backup being made for at least one day.

Any action on a large volume of data may need special attention and/or require contacting support beforehand, for example:

  • Renaming folders and/or files
  • Copying and/or moving data
  • Changing ownership and/or permissions
  • Adding data
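A script that performs one of these heavy operations could first check whether the nightly backup is running. The Python sketch below is one minimal way to do so. It assumes the window is expressed in the server's local time and treats 6 a.m. itself as outside the window; the function name `in_backup_window` is hypothetical.

```python
from datetime import datetime, time

# Backup window stated in this procedure: between 0 a.m. and 6 a.m.
BACKUP_START = time(0, 0)
BACKUP_END = time(6, 0)  # assumption: the end of the window is exclusive


def in_backup_window(moment):
    """Return True if the given datetime falls inside the nightly backup window."""
    return BACKUP_START <= moment.time() < BACKUP_END
```

A transfer script could call `in_backup_window(datetime.now())` and postpone the operation when it returns `True`.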

A special provision exists for this purpose: if you create a directory called nobackup, anywhere and as many times as you want, nothing in it will be backed up, so it will not disturb the backup process.
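The rule can be sketched in Python as follows. The helper names are hypothetical, and the exact matching performed by the backup system (for instance, case sensitivity) is not documented here, so the sketch assumes an exact, lowercase match on the directory name.

```python
from pathlib import Path


def is_excluded_from_backup(path):
    """Return True if any component of the path is named exactly 'nobackup'."""
    return "nobackup" in Path(path).parts


def make_nobackup(parent):
    """Create a 'nobackup' directory inside parent (no error if it already exists)."""
    target = Path(parent) / "nobackup"
    target.mkdir(parents=True, exist_ok=True)
    return target
```

For instance, `is_excluded_from_backup("project/nobackup/tmp.dat")` is true, while `is_excluded_from_backup("project/data.csv")` is false.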


Data

Raw data originating from scientific and/or cluster computers cannot be transferred to and/or stored in the GIGA storage space as is. It must first be discussed with the System manager so that a specific solution can be put in place.

It is mandatory to encapsulate and compress (ZIP, RAR, TAR...) this data into a single file to facilitate its management and processing by the storage servers.
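One possible way to do this is with Python's standard `tarfile` module, which packs a whole folder into a single gzip-compressed TAR archive. This is a minimal sketch, not a prescribed method; the function name `archive_directory` is hypothetical.

```python
import tarfile
from pathlib import Path


def archive_directory(source_dir, archive_path):
    """Pack source_dir and everything below it into one gzip-compressed TAR file."""
    source = Path(source_dir)
    with tarfile.open(str(archive_path), "w:gz") as tar:
        # arcname keeps only the folder name inside the archive, not its full path.
        tar.add(str(source), arcname=source.name)
    return Path(archive_path)
```

A folder of tens of thousands of small result files then reaches the storage server as one file, for example `archive_directory("run42", "run42.tar.gz")`.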

Any filename containing non-UTF-8 characters (this mostly affects Windows users) or special characters such as /\:*?"'<>| will not be included in the backup.
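Filenames can be checked against this rule before copying data to the storage. The Python sketch below tests a single file or folder name (not a full path, since `/` is itself in the excluded set). The function name `backup_safe` is hypothetical, and the excluded set contains only the characters listed above, although the text says "such as", so the real list may be longer.

```python
# Characters listed in this procedure as excluded from the backup.
FORBIDDEN = set('/\\:*?"\'<>|')


def backup_safe(name):
    """Return True if a single file or folder name is valid UTF-8 and
    contains none of the listed special characters."""
    if isinstance(name, bytes):
        try:
            name = name.decode("utf-8")
        except UnicodeDecodeError:
            return False
    try:
        # On POSIX systems Python represents undecodable filename bytes as
        # lone surrogates, which cannot be encoded back to UTF-8.
        name.encode("utf-8")
    except UnicodeEncodeError:
        return False
    return not any(ch in FORBIDDEN for ch in name)
```

Names for which `backup_safe` returns `False` should be renamed before the next backup.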

It is prohibited to create files or folders in /gpfs0 or in /gpfs0/home (linked to /home/mass).

The users' home directory cannot be used as temporary space for computations or cluster files. The number of files generated during computations (sometimes tens of thousands) could completely block the storage space. A computing space is available on the Grid for this purpose.

The scratch space (SCR) is dedicated to temporary computation data. This space is not backed up and, if free space runs short, its contents can be deleted to release space.


Methods of access

Access to the storage is via your ULg username and password.


Windows

  • Go to My Computer
  • Click Map Network Drive in the menu bar at the top of the window
  • Specify a Drive letter
- Example: Z:
  • In Folder path, enter the storage address followed by your ULg username.
- Example: \\storage.giga.priv\u123456
  • Tick the Connect using different credentials checkbox
  • Click Finish
  • A window appears asking for your ULg username and password.
  • Enter your ULg username after the prefix ULG\
- Example: ULG\u123456
  • Enter your password and click OK
  • You should be connected to your home storage


Mac OS

  • In Finder, select Go > Connect to Server
  • In the Server Address field, enter cifs:// followed by the storage address and your ULg username
- Example: cifs://storage.giga.priv/u123456
  • You will need to enter your ULg username and password
  • You should be connected to your home storage


Linux

  • Depending on the Linux distribution, the connection method may vary.
  • As a rule, you will be able to select Connect to a Server in the Linux window manager.
  • You will then have to enter the storage address followed by your ULg username.
- Example: smb://storage.giga.priv/u123456
  • You will be asked for your ULg username and password.
  • You should be connected to your home storage
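The three platforms use the same server and username and differ only in how the address is written. The Python sketch below builds each form from the examples given in the Windows, Mac OS and Linux steps. The function name `share_addresses` is hypothetical, and `u123456` is the placeholder username used in those examples.

```python
# Storage address taken from the examples in this procedure.
HOST = "storage.giga.priv"


def share_addresses(username, host=HOST):
    """Return the share address for each platform and the Windows login name."""
    return {
        "windows": "\\\\" + host + "\\" + username,   # UNC path
        "macos": "cifs://" + host + "/" + username,
        "linux": "smb://" + host + "/" + username,
        "windows_login": "ULG\\" + username,          # domain-prefixed username
    }
```

For example, `share_addresses("u123456")["linux"]` gives the address to type in the Connect to a Server dialog.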


Accessibility of data

The data can be accessed from anywhere in the institution over the ULg or CHU network (wired or wireless). To access data from outside, you need to log in to the ULg VPN.


To log in to the VPN, please visit https://www.ulg.ac.be/vpn. You should use the applications available at the bottom of the page.