File Storage Guide
Data Storage Locations
SRILS Server provides several storage locations for different types of data and use cases:
Home Directory
- Path: /home/username/
- Purpose: Personal files and private work
- Access: Only accessible by you
- Usage: Store personal scripts, notebooks, and small datasets
- Backup: Regularly backed up
- Quota: Contact admin for quota information
Shared Reference and Tools
- Path: /share/
- Purpose: Shared reference data and pre-built tools
- Access: Read-only access for all users
- Usage: Access reference genomes, databases, and common tools
- Subdirectories:
  - /share/reference/ - Reference data including genomes, databases, and pre-built indices
    - genome/ - Reference genomes and pre-built index files
    - contamination/ - Contamination reference databases
    - genes/ - Gene annotation databases
    - igv/ - IGV reference files
  - /share/tools/ - Pre-built tools and software packages
Accessing Files from Applications
From Jupyter Notebook
import os
# Check current directory
print(os.getcwd())
# List files in current directory
print(os.listdir('.'))
# Navigate to different directories
os.chdir('/home/username/data/')
print(os.listdir('.'))
# Load data from different locations
import pandas as pd
# From home directory
df1 = pd.read_csv('/home/username/data/dataset.csv')
# Access reference data and tools
reference_genome = '/share/reference/genome/hg38.fa'
igv_tool = '/share/tools/IGV_Linux_2.16.0/igv.sh'
# List available reference databases
print(os.listdir('/share/reference/'))
From RStudio
# Check current directory
getwd()
# List files
list.files()
# Set working directory to home data folder
setwd("/home/username/data/")
# Load data from different locations
# From home directory
data1 <- read.csv("/home/username/data/dataset.csv")
# Access reference data
reference_path <- "/share/reference/genome/"
tools_path <- "/share/tools/"
# List available reference files
list.files("/share/reference/genome/")
From Terminal/SSH
# Navigate between directories
cd /home/username/
cd /share/reference/ # Access reference genomes and databases
cd /share/tools/ # Access pre-built tools
# List files and directories
ls -la
ls -lh # Human readable file sizes
# Copy files between locations
cp /home/username/script.py /home/username/scripts/
# Move files
mv /tmp/processed_data.csv /home/username/data/
# Create symbolic links to shared resources
ln -s /share/reference/genome/hg38.fa ~/data/reference_genome.fa
# Access tools from /share/tools/
/share/tools/IGV_Linux_2.16.0/igv.sh
Storage Quotas and Limits
Quota Information
- Home Directory: Limited quota per user
- Shared Reference: Read-only access, managed by administrators
Check Usage
# Check disk usage in your home directory
du -sh ~/
# Check usage of specific directories
du -sh ~/data/
# Check overall disk space
df -h
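The same checks can be run from a Jupyter notebook without opening a terminal. The following is a minimal sketch using only the Python standard library; the helper name `dir_size_bytes` is hypothetical, and its total may differ somewhat from `du`, which counts allocated disk blocks rather than file lengths.

```python
import os
import shutil


def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (an approximation of `du -s`)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            # Skip symlinks so data linked from /share/ is not counted as yours
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                pass  # file disappeared or is unreadable; ignore it
    return total


def report_usage(path):
    """Print the size of path and the fill level of the filesystem it lives on."""
    size_mib = dir_size_bytes(path) / 1024**2
    fs = shutil.disk_usage(path)
    print(f"{path}: {size_mib:.1f} MiB")
    print(f"Filesystem: {fs.used / 1024**3:.1f} GiB used of {fs.total / 1024**3:.1f} GiB")


# Example call (walks the whole home directory, so it can take a while):
# report_usage(os.path.expanduser("~"))
```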
Managing Large Files
- Compress data: Use .gz, .zip, or .bz2 formats
- Use efficient formats: HDF5, Parquet instead of CSV for large datasets
- Archive old data: Move completed work to archive directories
- External storage: For very large datasets, consult with administrators
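As an illustration of the compression tip, here is a minimal sketch that gzip-compresses a file with the Python standard library. The function name `gzip_file` is hypothetical, and the original file is deliberately left in place so the compressed copy can be verified before anything is deleted.

```python
import gzip
import shutil


def gzip_file(src_path):
    """Write a gzip-compressed copy of src_path next to it and return the new path."""
    dst_path = src_path + ".gz"
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb") as dst:
        # Stream in chunks so large files are never loaded into memory at once
        shutil.copyfileobj(src, dst)
    return dst_path


# Example call, reusing the dataset path from the examples in this guide:
# gzip_file("/home/username/data/dataset.csv")
```

pandas infers gzip compression from the `.gz` extension, so a call such as `pd.read_csv('/home/username/data/dataset.csv.gz')` reads the compressed file directly without unpacking it first.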
Data Security and Privacy
Sensitive Data Guidelines
- Personal Data: Store in home directory only
- Confidential Research: Use appropriate access controls in home directory
- Reference Data: Available in /share/ for read-only access
- Temporary Processing: Use temporary directories with caution
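One way to apply access controls inside your home directory is to restrict a directory to its owner. The sketch below assumes standard POSIX permissions; the helper name `make_private` and the directory name in the example call are hypothetical, and the server may enforce additional policies (such as ACLs) that are not shown here.

```python
import os
import stat


def make_private(path):
    """Restrict a directory to its owner (rwx------) and return the resulting mode bits."""
    os.makedirs(path, exist_ok=True)
    os.chmod(path, stat.S_IRWXU)  # 0o700: owner may read, write, and enter; nobody else
    return stat.S_IMODE(os.stat(path).st_mode)


# Example call with a hypothetical directory name:
# make_private(os.path.expanduser("~/confidential"))
```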
Access Controls
- Respect file permissions and access controls
- Don’t share access credentials
- Report unauthorized access attempts
- Follow institutional data policies
Data Backup
- Automated Backups: Home directories are backed up regularly
- Version Control: Use Git for tracking changes
- External Backup: For critical data, maintain additional backups
- Recovery: Contact administrators for data recovery needs
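For an additional backup of critical data, one option is a dated, compressed archive that can then be copied off the server. This is a minimal sketch with the standard library; `archive_directory` and the paths in the example call are hypothetical, and the archive complements the automated backups rather than replacing them.

```python
import os
import tarfile
from datetime import date


def archive_directory(src_dir, dest_dir):
    """Create dest_dir/<name>-YYYY-MM-DD.tar.gz from src_dir and return its path."""
    src_dir = os.path.abspath(src_dir)
    name = os.path.basename(src_dir)
    os.makedirs(dest_dir, exist_ok=True)
    archive_path = os.path.join(dest_dir, f"{name}-{date.today().isoformat()}.tar.gz")
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname stores paths relative to the directory name, not the full path
        tar.add(src_dir, arcname=name)
    return archive_path


# Example call with hypothetical paths; keep dest_dir outside src_dir
# so the archive does not try to include itself:
# archive_directory("/home/username/data", "/home/username/archive")
```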
Getting Help
File System Issues
- Quota exceeded: Contact admin to increase quota or clean up files
- Permission denied: Check file permissions or contact admin
- File corruption: Restore from backup or contact admin
- Performance issues: Large file operations may be slow during peak hours
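When a "Permission denied" error appears, it can help to check what your user is allowed to do with the path before contacting an administrator. The following is a minimal sketch; `describe_access` is a hypothetical helper, not a tool provided by the server.

```python
import os
import stat


def describe_access(path):
    """Return the mode string of path and what the current user may do with it."""
    info = os.stat(path)
    return {
        "mode": stat.filemode(info.st_mode),  # same notation as the first column of `ls -l`
        "readable": os.access(path, os.R_OK),
        "writable": os.access(path, os.W_OK),
        "executable": os.access(path, os.X_OK),
    }


# Example call on the shared reference area, which this guide describes as read-only:
# print(describe_access("/share/reference/"))
```

If `os.stat` itself raises `PermissionError`, one of the parent directories is not searchable by your user, which is useful information to include when reporting the problem.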
Support Contacts
- Technical Issues: Contact SRILS Server administration team
- Data Management: Consult with data management team
- Backup/Recovery: Contact system administrators
- Storage Requests: Submit requests for additional storage space
Related Guides:
- Apps Usage Guide - Main applications overview
- Jupyter Notebook Guide - Jupyter-specific information
- RStudio Server Guide - RStudio-specific information
Need help? Contact the SRILS Server administration team for storage and file management support.