Geocomputing
CSC offers a wide range of high-quality computing services for various fields of research. Using our services might make sense if:
- Computing something takes more than 2-4 hours
- You need more memory than your own computer has
- You are working with very large datasets
- You need to work with GIS data or software already available on Puhti
- You want to keep your local workstation for regular usage and do computation elsewhere
- You need a server computer (cPouta)
- You need many computers with the same setup for courses (Noppe)
- You are running GPU or MPI programs
Our supercomputers have more storage space and memory than regular workstations. In general, the performance of one CPU core is not much higher than that of a normal desktop computer, but a supercomputer has thousands of cores, whereas a desktop computer has only a few. Using our supercomputers can significantly reduce computing time if the analysis can run in parallel on several CPUs or GPUs. The GPUs of our supercomputers are widely used for deep learning with spatial data.
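The idea of running an analysis in parallel can be illustrated with a minimal Python sketch. It is not CSC's own example: the tile names, the elevation values and the `summarize_tile` function are hypothetical stand-ins for a real, heavier analysis step. The sketch uses only the Python standard library and shows the same work done serially and spread over several CPU cores.

```python
from concurrent.futures import ProcessPoolExecutor
from statistics import mean


def summarize_tile(tile):
    """Return (name, mean elevation) for one tile.

    Hypothetical stand-in for a heavy per-tile analysis step.
    """
    name, elevations = tile
    return name, mean(elevations)


def run_serial(tiles):
    """Process the tiles one after another on a single core."""
    return dict(summarize_tile(tile) for tile in tiles)


def run_parallel(tiles, workers=2):
    """Process the tiles in separate worker processes, one tile per call."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return dict(pool.map(summarize_tile, tiles))


if __name__ == "__main__":
    # Made-up example data: two small "tiles" of elevation values.
    tiles = [("tile_a", [10.0, 12.0, 14.0]), ("tile_b", [100.0, 110.0])]
    print(run_parallel(tiles, workers=2))
```

A speed-up of this kind is only possible when the tiles can be processed independently of each other. For tiny inputs like the ones above, starting the worker processes costs more time than the computation itself.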
Use of CSC’s computing environments is mostly free-of-charge for users from Finnish universities and state research institutes. When the conditions for free-of-charge use are not met, it is possible to use many of our services by purchasing them. Companies looking into supercomputer resources are encouraged to use LUMI.
For geoinformatics users, the Puhti and LUMI supercomputers, as well as the cPouta cloud service, are especially useful computing environments.
Supercomputers for geocomputing
The Puhti supercomputer could be the first option to consider for geoinformatics users. Puhti provides a ready environment for computing: you just need to log in and start working! However, note that a supercomputer is not a desktop computer or web application, and using it requires some Linux command-line skills.
LUMI is one of the fastest supercomputers in the world. It provides especially large GPU resources, but also CPU resources. For spatial data research, LUMI could be valuable in international projects or public–private collaboration.
Tools
Many common GIS software packages are already installed on CSC supercomputers. Puhti has the widest selection of preinstalled GIS tools, but some are also available on LUMI and Mahti. The most used are R and Python with preinstalled spatial packages. Machine learning libraries and mathematical/statistical tools are often also relevant for geoinformatics. Before using any of the preinstalled software, the related module must always be loaded first. See the page of the specific software for details.
It is also possible to install software yourself for your project. The easiest option for installations is often the Tykky container wrapper tool, which is also available on LUMI. For Python and R, it is possible to add your own packages on top of the preinstalled environments.
Supercomputers run Linux as the operating system, so software available only for Windows, such as ArcGIS or Erdas, cannot be installed. Also, server-type software, such as PostGIS or GeoServer, is not suitable for supercomputers. For server tools, you can use our cPouta service.
CSC Service Desk is happy to help with installing tools on our supercomputers.
Data
Puhti has a shared data folder for spatial data, which is available to all users and includes the most important open GIS datasets of Finland. These include, for example:
- NLS DEM
- Lidar data and topographic database
- LUKE VMI
- All SYKE open data
Mahti and LUMI do not have local spatial data.
You can also move your own data to Puhti. There are different directories available for various purposes. In the scratch directory, each project has a 1 TiB quota by default, which can be extended on request. Scratch is cleaned up periodically, so also keep a copy of your important files in the Allas object storage. GDAL and all other software based on it support reading and writing data directly from Allas.
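As a rough illustration of direct access from GDAL, the sketch below builds a path for GDAL's `/vsis3/` virtual file system and sets the GDAL configuration options `AWS_S3_ENDPOINT` and `AWS_VIRTUAL_HOSTING` through environment variables. Several things here are assumptions rather than facts from this page: that Allas is reached through its S3 interface at the endpoint `a3s.fi`, that S3 credentials are already configured for GDAL, and the bucket and object names, which are hypothetical. Check the Allas documentation for the setup that applies to your project.

```python
import os


def allas_vsi_path(bucket, object_name):
    """Build a GDAL /vsis3/ path for an object in an S3-compatible bucket."""
    return f"/vsis3/{bucket}/{object_name.lstrip('/')}"


def configure_s3_endpoint(endpoint="a3s.fi"):
    """Point GDAL's S3 driver at a non-AWS endpoint.

    The default endpoint is an assumption about Allas; adjust it if needed.
    """
    os.environ["AWS_S3_ENDPOINT"] = endpoint
    # Use path-style URLs (endpoint/bucket/object) instead of virtual hosting.
    os.environ["AWS_VIRTUAL_HOSTING"] = "FALSE"


def open_raster(bucket, object_name):
    """Open a raster directly from object storage; requires GDAL's Python bindings."""
    from osgeo import gdal  # imported here so the helpers above work without GDAL

    configure_s3_endpoint()
    return gdal.Open(allas_vsi_path(bucket, object_name))


if __name__ == "__main__":
    # Hypothetical bucket and object names, shown only to illustrate the path format.
    print(allas_vsi_path("my-project-bucket", "dem/tile_001.tif"))
```

Because other tools built on GDAL accept the same kind of path, a `/vsis3/` path can usually be used wherever a local file name would go.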
Working with a supercomputer
Normally, scripts are used to work with a supercomputer. The most commonly used scripting languages for spatial analysis are R, Python and Bash. There are a few different ways to use supercomputers in practice; see Geocomputing on supercomputer, Parallel computing for more details.
Most often, scripts are run on supercomputers as batch jobs. Batch jobs make it possible to organize and balance the use of computing resources between different users. We provide some example scripts for spatial data analysis on Puhti, including:
- Allas object storage
- Python
- R
- FORCE
- GDAL
- GRASS
- PDAL
- SNAP
- Machine learning
These examples also include batch job scripts, and some of them include similar solutions for both serial and parallel jobs. GeoPortti also provides some longer examples on GitHub.
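To give an idea of how a script can adapt to the resources reserved by a batch job, here is a minimal Python sketch. It is not taken from the CSC examples. It assumes the batch system is Slurm, which sets the environment variable `SLURM_CPUS_PER_TASK` when cores per task are requested, and `SLURM_ARRAY_TASK_ID` and `SLURM_ARRAY_TASK_COUNT` in array jobs. The sketch also assumes array indices that start from 0, and the file names are hypothetical.

```python
import os


def worker_count(env=os.environ):
    """Number of CPU cores reserved per task; falls back to 1 if unset."""
    return int(env.get("SLURM_CPUS_PER_TASK", "1"))


def files_for_task(files, task_id, task_count):
    """Pick every task_count-th file starting at task_id.

    With 0-based task ids, each file goes to exactly one array task.
    """
    return files[task_id::task_count]


if __name__ == "__main__":
    # Hypothetical input files; a real script would list them from disk.
    all_files = [f"tile_{i}.tif" for i in range(10)]
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "0"))
    task_count = int(os.environ.get("SLURM_ARRAY_TASK_COUNT", "1"))
    print("cores available:", worker_count())
    print("files for this task:", files_for_task(all_files, task_id, task_count))
```

Run outside a batch job, the script falls back to one core and processes all files, which makes it easy to test on a small dataset before submitting a larger job.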
Interactive usage
CSC supercomputers also have web interfaces for testing or using software with a graphical user interface (GUI). This way you can use, for example, CloudCompare, Jupyter, QGIS, SNAP, GRASS GIS, SagaGIS, Zonation, RStudio or Spyder for Python.
cPouta for geocomputing
cPouta is an Infrastructure-as-a-Service (IaaS) cloud offered by CSC. It offers different hardware setups where the user has to install everything needed from scratch (operating system, software, network configuration, etc.). This gives the user the freedom to install custom computing environments. cPouta is ideal for running server-type software, such as PostGIS and GeoServer. Expert users can also set up their own computing clusters. cPouta requires server administration, software installation and Linux skills.
CSC provides some example GIS installation guidelines for cPouta, for example for ArcGIS Server, Agisoft Metashape, PostGIS, and GeoServer. GeoPortti provides guidelines for installing OpenDroneMap.
In practice, cPouta supports only different versions of Linux, so setting up ArcGIS Pro or Desktop is not possible. The easiest way to use some ArcGIS functionality is to install ArcGIS Server for Linux on cPouta and run ArcPy scripts.
Next steps for starting with CSC computing services
- Getting started with supercomputing at CSC (Docs CSC)
- To learn more about geocomputing and using supercomputers, see:
  - Geocomputing on the supercomputer course materials (suitable also for self-learning)
  - CSC computing environment self-learning course
  - See also the Skills development section on the Geoinformatics page
- If using remote sensing data, take a look at our Earth Observation guide
- Join the gis-hpc mailing list to get the latest news about CSC Geocomputing
If you have any questions or comments, or you need some other software or data, please contact CSC Service Desk.
Read more about the services
Allas
Data storage service for research projects
Use the Allas service for storing and sharing all types of data during your research project
Puhti
Supercomputer
A supercomputer for a wide range of use cases from data analysis to medium scale computation
Mahti
Supercomputer
Mahti is a supercomputer geared towards medium to large scale computations and simulations
LUMI
Supercomputer
LUMI is a leading platform for AI and one of the EuroHPC world-class supercomputers
cPouta
Community Cloud Service
cPouta provides quick and easy access to computing resources in a cloud service