1  Introduction

This chapter briefly introduces the R programming language and how to install R on different operating systems. Furthermore, we will have a look at the installation of RStudio. Finally, we will learn briefly about Fortran.

The book is designed to cater to individuals with a foundational understanding of programming, irrespective of their familiarity with a specific programming language.

1.1 R

R is a powerful and versatile open-source language and environment for statistical computing and graphics (R Core Team, 2023). R was developed to facilitate data manipulation, exploration, and visualisation, providing various statistical and graphical tools.

The R environment is designed for effective data handling, array and matrix operations, and is a well-developed and simple programming language (R Core Team, 2023). Its syntax is concise and expressive, making it an accessible language for newbies and seasoned programmers. R can perform tasks either by executing scripts or interactively. The latter is advantageous for beginners and during the first stages of script development. The interactive environment is accessible through command lines or various integrated development environments (IDEs), such RStudio.

Many factors play an important role in the popularity of R. For example, the ability to produce graphics with a publication’s quality. Simultaneously, the users maintain full control over customising the plots according to their preferences and intricate details. Another significant factor is its ability to be extended to meet the user’s demand. The Comprehensive R Archive Network (CRAN) contains an extensive array of libraries and packages to extend R’s functionality for various tasks. With over 20447 available packages (March 2024), the CRAN package repository is a testament to R’s versatility and adaptability.

Tidyverse (Wickham et al., 2019; Wickham, 2023) is the most popular bundle of R packages for data science. It was developed by Hadley Wickham and his team. The common shared design among tidyverse packages increases the consistency across functions and makes each new function or package a little easier to comprehend and utilise (Wickham et al., 2023).

Many packages have been developed for spatial data analysis to address various challenging tasks. R-spatial provides a rich set of packages for handling spatial or spatiotemporal data. For example, sf (Pebesma, 2018, 2024a; Pebesma & Bivand, 2023) provides simple features access in R, which is a standardized method for encoding spatial vector data. The stars package (Pebesma, 2024b) aims to handle spatiotemporal arrays. Additionally, the terra package (Hijmans, 2024) works with spatial data and has the ability to process large datasets on the disk when loading into memory (RAM) is not feasible.

Furthermore, research produces a large data sets; therefore, it is essential to store them according to the FAIR principles (Findability, Accessibility, Interoperability, and Reuse of digital assets; Wilkinson et al. (2016)). NetCDF and HDF5 are among the most prominent scientific data formats owing to their numerous capabilities. The ncdf4 package (Pierce, 2023) delivers a high-level R interface to data files written using Unidata’s netCDF library. Additionally, rhdf5 (Fischer et al., 2023) provides an interface between HDF5 and R.

Regarding the integration of other programming languages, R has diverse interfaces, which are either a fundamental implementation of R or attainable via another R package (Chambers, 2016). The fundamental interfaces are .Call(), .C(), and .Fortran() to C and Fortran. The development of the Rcpp package (Eddelbuettel et al., 2024) has revolutionised seamless access to C++. Python is also accessible using the reticulate package (Ushey et al., 2024). Finally, the Java interface was granted using the rJava package (Urbanek, 2024).

In conclusion, R is a valuable tool in the Earth System, as it can effectively tackle multifarious scientific tasks and address numerous outstanding research inquiries.

1.2 Fortran

Fortran, short for Formula Translation, is one of the oldest high-level programming languages. It was first developed in the 1950s, and is nonetheless widely used in scientific and engineering applications. Key features of Fortran include its ability to efficiently handle arrays and matrices, making it well-suited for numerical computations. Additionally, Fortran has a simple and straightforward syntax that makes it easy to learn and use, because it is possible to write mathematical formulas almost as they written in mathematical texts (Metcalf et al., 2018).

Fortran has been revised multiple times, with the most recent iteration being Fortran 2023. Another important feature of Fortran is its support for parallel programming, enabling developers to take advantage of multicore processors and high-performance computing architectures.

There are frequently numerous good reasons to integrate different programming languages to achieve tasks. Interoperability with C programming language is a feature that was introduced with Fortran 95 (Metcalf et al., 2018). Given that C is widely used for system-level programming, many of the other languages include support for C. Therefore, the C Application Programming Interface (API) can also be used to connect two non-C languages (Chirila & Lohmann, 2014). For instance, in atmospheric modelling, Fortran is used for its high performance and capcity to handle large data sets, while C is utilised for its efficiency and control over memory usage.

Noteworthy, developing software using Fortran necessitates utilisation of its primitive procedures and developing from scratch. This is because Fortran is not similar to scripting languages (i.e. R) that requires a special environment (Masuda, 2020). However, Fortran is privileged with its persistent backward compatibility, resulting in the usability of countless (legacy) codes written decades ago.

Since R was designed to streamline data analysis and statistics (Wickham, 2015) and Fortran is renowned for its high performance, it makes seance to integrate the two languages. It should be noted that both programming languages present arrays in column-major order, which makes it easier to bridge without causing confusion. Additionally, R is developed using Fortran, C, and R programming languages, and it features .Call and .External() functions that allows users to utilise compiled code from other R packages.

Despite the development of newer programming languages, Fortran remains a popular choice for many scientists and engineers due to its reliability, efficiency, and ability to handle large amounts of data.

1.3 Installation

R

Installation of R differs according to the operating system:

  • The webpage here provides information on installing R according to the Linux distribution.
  • For Windows, the executable can be downloaded from here. Additionally, previous releases of R for Windows exit.
  • For macOS, various releases and versions are accessible here.

RStudio

As mentioned earlier, RStudio is an IDE for R. Although it is the most prominent, other IDEs exist, such as Jupyter Notebook and Visual Studio Code. In this book, RStudio will be the main IDE. To install Rstudio, visit the posit page to download the suitable installers.

Fortran

Fortran doesn’t require an explicit installation, unlike interpreted languages such as R. The source code would be translated to the machine language using the Fortran compiler, and then it can be executed. Therefore, it is important to make sure that a Fortran compiler, i.e., the GNU Fortran (gfortran) compiler, exists in the working machine.

  • In the majority of Linux distributions, the GCC compilers, including gfortran, come pre-installed.
  • For Windows, installing rtools should ensure the existence of gfortran
  • For macOS, binaries for gfortran are available here. Furthermore, more information is available on R for macOS here