GIS Data in Puhti

Open GIS data is stored in Puhti under /appl/data/geo

Currently (Jan 2020) the following datasets are available:

  • Paituli data. Paituli includes datasets from the Agency for rural affairs, Finnish Meteorological Institute, Institute for the languages of Finland, National Land Survey, Natural resource institute, Population Register Center, Statistics Finland, Traffic Agency and University of Helsinki.
    • Full list of Paituli datasets
    • All Paituli datasets have a readme-file with a link to Etsin dataset descriptions and terms of use. In general all datasets have an open license, but the exact terms vary a bit.
    • If you have trouble finding a file, the Paituli download page can help. The dataset path is shown under the links (crop the beginning), or, if the dataset has a lot of mapsheets, you can download the file list with "Download list of files".
    • NLS normal color ortho images are not available in Puhti, but the infrared ones are.


  • Additionally:
    • NLS
      • DEM 2m and 10m have virtual rasters, see the section below
      • Stereoclassified lidar data has been slightly modified. The original NLS data had mistakes in the headers, which have been fixed. Additionally, lax-index files have been added.
      • Lidar data, the automatically classified version (last update Nov 2018, only in Taito)
    • LUKE
      • Multi-source national forest inventory, 2013, 2015 and 2017. LUKE license changed in Aug 2019 to CC BY 4.0.
    • SYKE
    • Satellite mosaics produced by SYKE and FMI in the Paikkatietoalusta project, added in 2020

NLS 2m DEM, lidar, infrared orthophotos, all SYKE datasets and satellite mosaics are updated in Puhti automatically every Monday.


Ready-made virtual rasters in Puhti

There are ready-made virtual rasters for the 2m and 10m elevation models and infrared orthophotos in Puhti. There are two variants of virtual rasters for the elevation models:

  1. The direct virtual rasters reference the source tif images directly, without any hierarchical structure, overviews or pre-calculated statistics. They are meant for use in scripts only and should not be opened in QGIS or similar programs:
    • 2m DEM: /appl/data/geo/mml/dem2m/dem2m_direct.vrt
    • 10m DEM: /appl/data/geo/mml/dem10m/dem10m_direct.vrt
    • infrared orthophotos: /appl/data/geo/mml/orto/infrared_3067/infrared_euref_direct.vrt
  2. The hierarchical virtual raster is mainly for viewing purposes, for example with QGIS. It has a hierarchical structure in which the virtual raster for each folder contains all the data stored in that folder and its subfolders. For example, to view 2m DEM data from the area of mapsheet M4, you would open the /appl/data/geo/mml/dem2m_vrt/2008_latest/M4/M4.vrt file. That loads the virtual rasters for mapsheets M41, M42, M43 and M44, which in turn contain information about the actual tif files. To view the whole 2m DEM dataset, open the dem2m_hierarchical.vrt file.

    The hierarchical file structure also contains statistics (min, max, mean, stddev) and overviews for each vrt file, which makes viewing the entire 2m or 10m DEM dataset fairly responsive, for example in QGIS. A good tip is to enable the Raster toolbar through View->Toolbars, which lets you easily adjust the color scale to the min-max of the current view extent. This way the whole dataset can be viewed easily at different zoom levels.

    You may also use the lowest-level virtual rasters (for example M41 in the 2m DEM) in scripts, but higher-level virtual rasters may cause computational errors.
    • 2m DEM: /appl/data/geo/mml/dem2m/dem2m_hierarchical.vrt
    • 10m DEM: /appl/data/geo/mml/dem10m/dem10m_hierarchical.vrt

For analysis, the best performance is achieved by creating your own virtual raster that covers only your study area.

Creating virtual rasters for a specific area

There is a ready-made Python script for creating virtual rasters for your study area from the datasets available in Puhti. The script is located at /appl/soft/geo/vrt/ and is used in the following way:

python /appl/soft/geo/vrt/ dataset polygon_file output_directory

Available dataset values are:

  • dem2m - 2m DEM from NLS
  • dem10m - 10m DEM from NLS
  • demCombined - DEM covering the whole of Finland, using the 2m DEM wherever it is available and interpolating the remaining areas from the 10m DEM to 2m resolution with bicubic interpolation.

Optional arguments:

  • -i: create an individual vrt for each polygon; the default behavior is to create one vrt covering all polygons.
  • -o: create overviews
  • -p: output name prefix
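
When the script has to be run for several study areas, it can be convenient to assemble the command from Python. The following is a rough sketch under stated assumptions: build_vrt_command and run_vrt_script are hypothetical helper names, and script_path is left to the caller because the text above gives only the location /appl/soft/geo/vrt/. The dataset names and the -i, -o and -p flags are the ones listed above.

```python
import subprocess

# Dataset names listed in the text above.
DATASETS = ("dem2m", "dem10m", "demCombined")


def build_vrt_command(script_path, dataset, polygon_file, output_directory,
                      individual=False, overviews=False, prefix=None):
    """Assemble the argument list for the virtual raster script.

    script_path must be supplied by the caller (hypothetical parameter);
    the remaining arguments mirror the positional arguments and flags
    described in the text.
    """
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset {dataset!r}, expected one of {DATASETS}")
    command = ["python", script_path, dataset, polygon_file, output_directory]
    if individual:
        command.append("-i")  # one vrt per polygon
    if overviews:
        command.append("-o")  # also create overviews
    if prefix is not None:
        command += ["-p", prefix]  # output name prefix
    return command


def run_vrt_script(*args, **kwargs):
    """Run the script; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_vrt_command(*args, **kwargs), check=True)
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.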

The script uses the gdal and geopandas Python modules, so the geoconda module should be loaded before running the script. For more details see:

Example: create one virtual raster from the 2m DEM covering the polygons in my_study_area.shp and save it as /scratch/projectX/output/2m_dem_virtual_raster.vrt. The list of files included in the .vrt (2m_dem_file_list.txt) is saved to the same directory, as is the external overview file (2m_dem_virtual_raster.vrt.ovr).

python /appl/soft/geo/vrt/ dem2m my_study_area.shp /scratch/projectX/output -o -p 2m_dem
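
After such a run it can be useful to check that the expected files were written. The sketch below is only an illustration: the naming pattern is inferred from the single example above (prefix 2m_dem) and may not hold for other options such as -i, and expected_outputs and missing_outputs are hypothetical helper names.

```python
from pathlib import Path


def expected_outputs(output_directory, prefix, overviews=False):
    """Return the files the example run above is described as producing.

    The "<prefix>_virtual_raster.vrt" pattern is an assumption inferred
    from the one example in the text.
    """
    out = Path(output_directory)
    vrt = out / f"{prefix}_virtual_raster.vrt"
    files = [vrt, out / f"{prefix}_file_list.txt"]
    if overviews:
        # External overview file, expected only when -o was given.
        files.append(out / f"{prefix}_virtual_raster.vrt.ovr")
    return files


def missing_outputs(output_directory, prefix, overviews=False):
    """List the expected files that do not exist on disk."""
    return [path for path in expected_outputs(output_directory, prefix, overviews)
            if not path.exists()]
```

For the example above, missing_outputs("/scratch/projectX/output", "2m_dem", overviews=True) should return an empty list once the script has finished successfully.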