:sequential_nav: next

..  _tutorial-basic-first-encounter:

A First Encounter with SIESTA
*****************************

.. dropdown:: **Do you want to get all tutorial materials on your local machine?**
  :color: primary

  You can get all files and set up your local environment.
  :ref:`Do that now <local_installation>` before proceeding.


This first tutorial provides a first acquaintance with **SIESTA** by
studying one simple molecule, CH4. It covers most of the basic
features and concepts, without worrying too much about issues of
convergence or fine-tuning.

.. button-ref:: tutorial-basic-first-encounter-start
  :color: success
  :outline:
  :expand:
  :shadow:

  Take me to the exercises!


:bdg-info:`background` Background Information
---------------------------------------------

.. dropdown:: Additional information: Slides
  :color: info

  These slides from previous presentations might help improve your
  understanding of the concepts presented in this tutorial:

   * Presentation by `Emilio Artacho <https://siesta-project.org/SIESTA_MATERIAL/Docs/Tutorials/tlv14/slides15.pdf>`_

   As a general background for basis sets, you can watch the 2021 talks by
   Emilio Artacho:

   * Part I: https://www.youtube.com/watch?v=puweAzW9ZNs
   * Part II: https://www.youtube.com/watch?v=AlY2SDb1ta0

   This tutorial will only cover basic, general options for basis set generation.
   We encourage you to jump into the dedicated tutorial for
   :ref:`basis set optimization <tutorial-basic-basis-optimization>` afterwards.


The inputs of SIESTA
''''''''''''''''''''

The essential inputs of SIESTA are the FDF (Flexible Data Format) file and the
pseudopotential file.

FDF file
~~~~~~~~
It is the core input, since it provides information about the system and
sets the values of several parameters regarding the quality of the
calculation, as well as the pseudopotential information.

The **FDF file** (*input.fdf*) is a plain-text input file. It is
structured in name-value pairs—that is, a name characterizing the data and
its value. Units can be specified after the value. These name-value pairs
can be part of more complex structures called blocks
(``%block label [...] %endblock``). The FDF file is case-insensitive,
and characters ``-``, ``_``, and ``.``  in a data label are ignored.

The list of available
`descriptors <https://docs.siesta-project.org/projects/siesta/en/stable/reference/siesta.html#program-options>`_
is extensive; however, most of them have default values. To describe the
system for a calculation, the minimum information needed is provided by the
following descriptors:

* General system information: :ref:`SystemLabel<fdfparam:systemlabel>`,
  :ref:`NumberOfSpecies<fdfparam:NumberOfSpecies>`,
  :ref:`NumberOfAtoms<fdfparam:NumberOfAtoms>`
* Structural information: :ref:`LatticeConstant<fdfparam:LatticeConstant>`,
  :ref:`LatticeVectors<fdfparam:LatticeVectors>`,
  :ref:`AtomicCoordinatesFormat<fdfparam:AtomicCoordinatesFormat>`,
  :ref:`AtomicCoordinatesAndAtomicSpecies<fdfparam:AtomicCoordinatesAndAtomicSpecies>`.

As a part of the FDF file we can find the information regarding the
**basis set** used. SIESTA offers many options for the specification of the
basis set. Two of the key concepts are the cardinality of the set and range of
the orbitals; both can be specified, in most cases, simply with a couple of lines.

* The cardinality of the set, or its size, can be set by
  :ref:`PAO.BasisSize<fdfparam:pao.BasisSize>`. It follows the usual convention of
  `amount of functions per valence orbital <https://en.wikipedia.org/wiki/Basis_set_(chemistry)#Split-valence_basis_sets>`_,
  the possible options being *SZ, SZP, DZ, DZP*... For example, for C, a SZ
  basis set has 4 functions (s+p) and a DZP basis set has 13 functions
  (2*s, 2*p, and a d polarization shell).

* The energy shift is a physical concept determining the increase in energy
  brought about by the confinement of the orbital; the lower the energy shift,
  the larger the cut-off radii of basis orbitals. It is set using
  :ref:`PAO.EnergyShift<fdfparam:pao.EnergyShift>`.

As in any DFT code, the **exchange-correlation functional** must be specified; the
default option is using the local density approximation (LDA) for our calculations.
However, it is also possible to use other functionals, such as those of GGA type.
The information of the exchange-correlation functional should be provided like
this::

    XC.functional GGA
    XC.authors    PBE

Notice that both :ref:`XC.Authors<fdfparam:xc.authors>` and
:ref:`XC.Functional<fdfparam:xc.functional>` are needed, and they must be
consistent, check the
`manual <https://docs.siesta-project.org/projects/siesta/en/stable/reference/siesta.html#exchange-correlation-functionals>`_.
For ``XC.Functional GGA``, the most used variants are BLYP, PBE and PBEsol. For
``XC.Functional VDW``, frequently used functionals are DRSLL, LMKLL and VV. See
the manual for more options on XC functionals.

Pseudopotentials
~~~~~~~~~~~~~~~~
The other essential input of SIESTA is the **pseudopotential** information.
SIESTA uses pseudopotentials to represent the core electron-ion interaction, as
do most plane-wave codes (in contrast to so-called “all-electron” programs).
In particular, the pseudopotentials are of the “norm-conserving” kind. The
choice of a pseudopotential significantly impacts the quality of the
calculation.

SIESTA reads pseudopotentials from different files based on the species
information in :ref:`%block ChemicalSpeciesLabel<fdfparam:ChemicalSpeciesLabel>`.
By default, files are searched for in the current directory. SIESTA accepts
different formats, but **PSML** is the one recommended. Curated databases of
high-quality PSML files are available. In particular, the
`Pseudo-Dojo project <http://www.pseudo-dojo.org/>`_ offers PSML files for
almost the entire periodic table, along with reports detailing the tests
performed during their generation. The database includes variants for scalar
and fully relativistic pseudopotentials using LDA, PBE, and PBEsol functionals.

Generating a new pseudopotential from scratch is beyond the scope of this
tutorial, but more details can be found in :ref:`tutorial-basic-pseudopotentials`.
Note that ATOM generates **psf** files, which are also supported by SIESTA.

The pseudopotential *should be tested*. The method for testing depends on the
type of system being simulated, so reviewing existing literature is strongly
recommended.

The output
''''''''''

SIESTA writes a log of its workings to ``standard output``, which is
recommended to be redirected to an output file. In the standard output,
the entire thinking process of SIESTA can be followed: starting from
reading the input parameters, defining the system, and proceeding to
the self-consistent-field convergence.

System properties can be found in it; however, not all of them are
written by default. More detailed information will be given in the
exercises: see :ref:`below<tutorial-basic-firstencounter-output>`.


.. _tutorial-basic-first-encounter-start:

:bdg-success:`basic` Look at the input files
--------------------------------------------

.. dropdown:: **Are you under a local environment?**
  :color: primary

  Enter directory *01-CH4-Basic*


You will find an input file named *ch4.fdf*, along with the files *C.psml*
and *H.psml* which contain the information about the pseudopotentials.


* Pseudos: :download:`C.psml<work-files/01-CH4-Basic/C.psml>`,
  :download:`H.psml<work-files/01-CH4-Basic/H.psml>`
* Input file: :download:`ch4.fdf<work-files/01-CH4-Basic/ch4.fdf>`

.. dropdown:: A peek at the fdf file

  .. literalinclude:: work-files/01-CH4-Basic/ch4.fdf
    :linenos:

.. System information

We find first the descriptors that specify the system

.. literalinclude:: work-files/01-CH4-Basic/ch4.fdf
  :linenos:
  :lineno-match:
  :start-at: SystemName
  :end-at: %endblock ChemicalSpeciesLabel

.. note::
  You will notice several lines starting with #. These are comments on the file
  and will not be read by SIESTA.


The :ref:`SystemName<fdfparam:SystemName>` is only used for tagging purposes,
while the :ref:`SystemLabel<fdfparam:systemlabel>` will determine the root name
for all output files (i.e., they will all begin by ch4.* in our case).
:ref:`NumberOfAtoms<fdfparam:NumberOfAtoms>` indicates the total number of atoms
in our simulation box, and :ref:`NumberOfSpecies<fdfparam:NumberOfSpecies>`
indicates the different kinds of atoms present in the system.

Pay special attention to the
:ref:`%block ChemicalSpeciesLabel<fdfparam:ChemicalSpeciesLabel>`. In this
block, you assign an index, an atomic number, and a label to each atomic
species. The atomic number is, as usual, the nuclear charge of the atom. The
label will allow SIESTA to recognize the files containing the information about
the pseudopotential and the basis set (when provided); this means that the name
of the label must coincide with the name of the pseudopotential file. The amount
of species in this block must be the one specified by the
:ref:`NumberOfSpecies<fdfparam:NumberOfSpecies>` input.

.. note::
  Options in the fdf file are case-insensitive. This means that
  *NumberOfAtoms* is the same as *numberofatoms* or *NUMBEROFATOMS*.

A few lines below, we find the coordinates for the atoms in the system

.. literalinclude:: work-files/01-CH4-Basic/ch4.fdf
  :linenos:
  :lineno-match:
  :start-at: AtomicCoordinatesFormat
  :end-at: %endblock AtomicCoordinatesAndAtomicSpecies


In this block, each of the lines corresponds to an atom. The first three numbers
are the X, Y and Z coordinates for each atom, and the last (integer) number
indicates the species index in the
:ref:`%block ChemicalSpeciesLabel<fdfparam:ChemicalSpeciesLabel>`. In this case,
there is a Carbon atom (species index = 1) in the position (0.0,0.0,0.0), and
the rest are the four Hydrogens (species index = 2).

The :ref:`AtomicCoordinatesFormat<fdfparam:AtomicCoordinatesFormat>` option
can also be set to *Bohr* (which is the default value), or to *Fractional* if
one desires to set the coordinates as a fraction of the lattice vectors
(see below).

.. Simulation box and Periodic Boundary Condtions
.. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You might have wondered about the existence of this block in the input file:

.. literalinclude:: work-files/01-CH4-Basic/ch4.fdf
  :linenos:
  :lineno-match:
  :start-at:  #Unit
  :end-at: %endblock LatticeVectors


It does not seem to make sense for a 'molecule' calculation. In fact, SIESTA
always uses periodic boundary conditions (PBC); this means that we are doing a
calculation for an infinite collection of regularly spaced molecules. If we want
to simulate an isolated molecule, it is important to have enough distance
between the molecule and its neighboring images. At the very minimum, there
should not be overlap between the orbitals on different image molecules;
however, this size should be bigger for molecules with a dipole, which will
have a long-range interaction with their images in PBC.

The input option :ref:`LatticeConstant<fdfparam:LatticeConstant>` is mandatory
and multiplies the lattice vectors by that constant. In this case, we will end
up with a cubic simulation box of 15 x 15 x 15 Angstrom.


.. dropdown:: Additional information: Periodic crystals
  :color: info

  Imagine that instead of studying methane, we want to study bulk silicon.
  Silicon is a periodic face-centered cubic structure that can be defined
  by the primitive unit cell with ``a = 5.546790 Å``, given by the lattice
  vectors:

    :math:`a_1 = \frac{a}{2}(\hat{y} + \hat{z})`,
    :math:`a_2 = \frac{a}{2}(\hat{x} + \hat{z})`, and
    :math:`a_3 = \frac{a}{2}(\hat{x} + \hat{y})`.

    The unit cell contains two atoms positioned at
    :math:`(-0.125, -0.125, -0.125)` and :math:`(0.125, 0.125, 0.125)`
    in fractional coordinates.

    .. dropdown:: Answer: Can you write the structural parameters to define silicon bulk?
      :color: success

      The descriptors that define the structure are: ``NumberOfSpecies``, ``NumberOfAtoms``,
      ``%block ChemicalSpeciesLabel``, ``LatticeConstant``, ``%block LatticeVectors``,
      ``AtomicCoordinatesFormat`` and ``%block AtomicCoordinatesAndAtomicSpecies``

      Thus, the input will have an structure like the following::

        SystemLabel       Si
        NumberOfSpecies   1
        NumberOfAtoms     2
        %block ChemicalSpeciesLabel
          1  14  Si
        %endblock ChemicalSpeciesLabel

        LatticeConstant         5.546790 Ang
        %block LatticeVectors
          0.00  0.50  0.50
          0.50  0.00  0.50
          0.50  0.50  0.00
        %endblock LatticeVectors

        AtomicCoordinatesFormat     Fractional
        %block AtomicCoordinatesAndAtomicSpecies
        -0.125  -0.125  -0.125     1    28.086
          0.125   0.125   0.125     1    28.086
        %endblock AtomicCoordinatesAndAtomicSpecies

.. Further Options

In the rest of the file you will find options regarding the quality of the
calculation and other performance related variables:

.. literalinclude:: work-files/01-CH4-Basic/ch4.fdf
  :linenos:
  :lineno-match:
  :start-at:  # Basis
  :end-at: SolutionMethod diagon

The specifics of these will be explored in other tutorials; but to provide
a brief overview:

* The first three variables are a rough definition the quality of the
  **basis set**. The number of orbitals per atom is defined by the parameter
  :ref:`PAO.BasisSize<fdfparam:pao.BasisSize>`, and we have set it to select a
  minimal basis (SZ) for a quick, cheap calculation disregarding quality.
* The parameter :ref:`MeshCutoff<fdfparam:mesh.cutoff>` controls the fineness
  of the **real-space grid** used to compute the integrals for the matrix
  elements of the Hamiltonian. A higher value corresponds to a finer grid.
* The next three options control the **performance of the self-consistent field
  (SCF) cycle**: :ref:`MaxSCFIterations<fdfparam:MaxSCFIterations>`,
  :ref:`DM.MixingWeight<fdfparam:DM.MixingWeight>`,
  :ref:`DM.NumberPulay<fdfparam:DM.NumberPulay>`.
* The kind of *solver* used is set by :ref:`SolutionMethod<fdfparam:SolutionMethod>`.

While the parameters specifying the system are mandatory, all other parameters
have some default values and, in principle, it is not necessary to explicitly
include them in the input file. However, it is important to note that the
default values do not always guarantee a fast or a good quality calculation.


:bdg-success:`basic` Running SIESTA
-----------------------------------

To run SIESTA via command line, you just need to do

.. code-block:: console

  $ siesta < ch4.fdf > ch4.out

or simply

.. code-block:: console

  $ siesta ch4.fdf > ch4.out

You will see that a lot of output files are generated in the simulation
directory. We will begin by having a look at the main output file, *ch4.out* (or
any name that you gave to the standard output).


:bdg-success:`basic` A look at the output
------------------------------------------

.. _tutorial-basic-firstencounter-output:

Take a look at *ch4.out* once the program has finished. The file
contains a human-readable account of the doings of SIESTA for
this calculation:

* A header with the version and compilation options, and a copy (in the first
  mode above) or a mention (in the second mode) of the input file.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

      Siesta Version  : 5.0
      Architecture    : ----
      Compiler version: GNU-11.4.0
      Compiler flags  : -fallow-argument-mismatch;-O3 -march=native
      PP flags        : ----
      Libraries       : ----
      Parallelisations: MPI

      [...]

      ************************** Dump of input data file ****************************
      #General system specifications
      SystemName          CH4 molecule
      SystemLabel         ch4
      NumberOfAtoms       5
      NumberOfSpecies     2

      [...]

      # Type of solution
      SolutionMethod diagon
      ************************** End of input data file *****************************


* A summary of existing species and pseudopotential information.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

      initatom: Reading input for the pseudopotentials and atomic orbitals ----------
      Species number:   1 Atomic number:    6 Label: C
      Species number:   2 Atomic number:    1 Label: H

      [...]

      Valence configuration for ps generation: (assumed as above)
      For C, standard SIESTA heuristics set lmxkb to 2
      (one more than the basis l, including polarization orbitals).
      Use PS.lmax or PS.KBprojectors blocks to override.
      For H, standard SIESTA heuristics set lmxkb to 1
      (one more than the basis l, including polarization orbitals).
      Use PS.lmax or PS.KBprojectors blocks to override.


* Details on the basis set generation.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

        <basis_specs>
        ===============================================================================
        C                    Z=   6    Mass=  12.011        Charge= 0.17977+309
        Lmxo=1 Lmxkb= 2    BasisType=split      Semic=F
        L=0  Nsemic=0  Cnfigmx=2
                  i=1  nzeta=1  polorb=0  (2s)

        [...]

                          izeta = 1
                        lambda =    1.000000
                            rc =    4.810269
                        energy =   -0.447136
                        kinetic =    0.945150
            potential(screened) =   -1.392287
              potential(ionic) =   -1.930221
        atom: Total number of Sankey-type orbitals:  1

        atm_pop: Valence configuration (for local Pseudopot. screening):
        1s( 1.00)
        Vna: chval, zval:    1.00000   1.00000

        Vna:  Cut-off radius for the neutral-atom potential:   4.810269

* A summary of the values used in the calculation for the
  most important parameters.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

        siesta: ******************** Simulation parameters ****************************
        siesta:
        siesta: The following are some of the parameters of the simulation.
        siesta: A complete list of the parameters used, including default values,
        siesta: can be found in file out.fdf
        siesta:
        redata: Spin configuration                          = none
        redata: Number of spin components                   = 1

        [...]

        ts: **************************************************************
        ts: Save H and S matrices                           =    F
        ts: Save DM and EDM matrices                        =    F
        ts: Only save the overlap matrix S                  =    F
        ts: **************************************************************

* Information of the geometry and matrix sparsity for a given geometry.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

      outcell: Unit cell vectors (Ang):
            15.000000    0.000000    0.000000
              0.000000   15.000000    0.000000
              0.000000    0.000000   15.000000

      outcell: Cell vector modules (Ang)   :   15.000000   15.000000   15.000000
      outcell: Cell angles (23,13,12) (deg):     90.0000     90.0000     90.0000
      outcell: Cell volume (Ang**3)        :   3375.0000
      <dSpData1D:S at geom step 0
        <sparsity:sparsity for geom step 0
          nrows_g=8 nrows=8 sparsity=1.0000 nnzs=64, refcount: 7>
        <dData1D:(new from dSpData1D) n=64, refcount: 1>
      refcount: 1>
      new_DM -- step:     1

      [...]

      InitMesh: MESH = 108 x 108 x 108 = 1259712
      InitMesh: Mesh cutoff (required, used) =   125.000   143.274 Ry
      New grid distribution [1]: sub = 2

      stepf: Fermi-Dirac step function


* Followed by a full breakdown of the initial, non-converged energy.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

      siesta: Program's energy decomposition (eV):
      siesta: Ebs     =       -86.615317
      siesta: Eions   =       380.812454
      [...]
      siesta: Etot    =      -214.688379
      siesta: FreeEng =      -214.688379


* A log of the self-consistent-field (SCF) cycle.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

            iscf     Eharris(eV)        E_KS(eV)     FreeEng(eV)     dDmax    Ef(eV) dHmax(eV)
        scf:    1     -224.477055     -214.688379     -214.688379  1.130398 -6.803327  1.039357
      timer: Routine,Calls,Time,% = IterSCF        1       0.314  28.15
        scf:    2     -214.708050     -214.704182     -214.704182  0.022332 -6.481242  0.243330
        scf:    3     -214.704223     -214.704283     -214.704283  0.002871 -6.378875  0.148294
        scf:    4     -214.704298     -214.704314     -214.704314  0.001596 -6.231017  0.052566
        scf:    5     -214.704324     -214.704321     -214.704321  0.000517 -6.269207  0.001236
        scf:    6     -214.704321     -214.704321     -214.704321  0.000031 -6.268437  0.000383

      SCF Convergence by DM+H criterion
      max |DM_out - DM_in|         :     0.0000307918
      max |H_out - H_in|      (eV) :     0.0003831179
      SCF cycle converged after 6 iterations

* A full breakdown of the total energy decomposition and forces, and stress.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

      siesta: Program's energy decomposition (eV):
      siesta: Ebs     =       -88.492716
      siesta: Eions   =       380.812454
      [...]
      siesta:         Total =    -214.704321
      siesta:         Fermi =      -6.268437

      [...]
      siesta: Atomic forces (eV/Ang):
      siesta:      1    0.140447    0.140447   -1.039696
      siesta:      2   -2.496202    0.517116    0.806177
      siesta:      3    0.517116   -2.496202    0.806177
      siesta:      4    0.363575    0.363575   -1.130990
      siesta:      5    1.475788    1.475788    0.558249
      siesta: ----------------------------------------
      siesta:    Tot    0.000724    0.000724   -0.000084

      siesta: Stress tensor (static) (eV/Ang**3):
      siesta:     0.001324   -0.000018   -0.000133
      siesta:    -0.000018    0.001324   -0.000133
      siesta:    -0.000133   -0.000133    0.000671

* The last part of the output file includes information on the cell pressure
  and the total electric dipole of the system.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::


      siesta: Cell volume =       3375.000000 Ang**3

      siesta: Pressure (static):
      siesta:                Solid            Molecule  Units
      siesta:          -0.00001205          0.00000000  Ry/Bohr**3
      siesta:          -0.00110663          0.00000026  eV/Ang**3
      siesta:          -1.77300959          0.00042191  kBar
      (Free)E+ p_basis*V_orbitals  =        -214.210260
      (Free)Eharris+ p_basis*V_orbitals  =        -214.210260

      siesta: Electric dipole (a.u.)  =   -0.011736   -0.011736    0.009580
      siesta: Electric dipole (Debye) =   -0.029830   -0.029830    0.024350

* And to finish, literature regarding the simulation done.

  .. dropdown:: Check it!
    :color: muted

    .. code-block::

      cite: Please see "ch4.bib" for an exhaustive BiBTeX file.
      cite: Please clearly indicate Siesta version in published work: 5.1-MaX
      cite: This calculation has made use of the following articles
      cite: which are encouraged to be cited in a published work.
          Primary SIESTA paper
            DOI: www.doi.org/10.1088/0953-8984/14/11/302


.. Other Output files

As you can see, SIESTA generates a ton of output files besides the main log/out
file. Each of these might interest you depending on what you intend to do with
your simulation, but some of the most commonly relevant files are:

* \*.DM files (*ch4.DM*) contain a restart of the Density Matrix. Useful to
  restart already converged or close-to-convergence calculations without doing
  so from scratch.

* \*.XV files (*ch4.XV*) contain a restart of the current geometry and cell
  size. This is particularly useful to restart or continue molecular dynamics
  runs.

* \*.FA files (*ch4.FA*) contain total forces over all atoms. Each line
  corresponds to the cartesian force (x, y and z) over each atom (in eV/Ang).

* \*.EIG files (*ch4.EIG*) contain the eigenvalues of the Hamiltonian for all
  k-points involved. k-points are explored in more detail in
  :ref:`tutorial-basic-kpoint-convergence`.

* \*.times and *TIMES* files (*ch4.times* and *TIMES*) contain timing
  information, which is invaluable when optimizing parameters for fast
  performance.

.. Let's play!

Now that you had a brief overview of how running SIESTA works, let's play a bit
with your CH4 inputs. *See what happens if...*:

  * You change the fineness of the real-space grid. Try an unreasonably low
    value for the the parameter :ref:`MeshCutoff<fdfparam:mesh.cutoff>`
    (around 10 Ry) and check the resulting total energy and forces (you can
    also find the forces in the file *ch4.FA*). Then experiment with a much
    higher one (300 Ry). More on this topic will be covered in
    :ref:`this tutorial<tutorial-basic-grid-convergence>`.

    .. dropdown:: Answer: Change of MeshCutoff.
      :color: success

      Everytime that you run SIESTA in a folder, all the output will be
      overwritten, thus to be able to compare the results of the diferent runs,
      a specific directory should be created for each of them.

      The information or the total energy can be extracted in a more automatic
      way doing a ``grep`` of ``Total =`` of the standard output file and to
      extract the total force look for the line that contains
      ``siesta:    Tot``; for example if you have structured the calculation
      in different folders:

      .. code-block:: console

        $ grep -i  "siesta:    Tot" */output.dat

      .. code-block::

        10ry/output.dat:siesta:    Tot    1.798364    1.798364   -3.227789
        300ry/output.dat:siesta:    Tot   -0.000058   -0.000058   -0.000291

      .. code-block:: console

        $ grep -i  "siesta:    Total =" */output.dat

      .. code-block::

        10ry/output.dat:siesta:         Total =    -230.840670
        300ry/output.dat:siesta:         Total =    -222.538059


  * You can play with the size of the lattice parameters to go from
    'interacting' molecules to effectively isolated ones. Look at the variation
    in the total energy as a function of the cell size, to see how the
    interaction between molecules decreases with increasing distance between
    images. For this non-polar molecule, the interaction should be very small.

    .. dropdown:: Answer: Change of lattice parameter.
      :color: success

      Everytime that you run SIESTA in a folder, all the output will be
      overwritten, thus to be able to compare the results of the diferent runs,
      a specific directory should be created for each of them.

      The information or the total energy can be extracted in a more automatic
      way doing a ``grep`` of ``Total =`` of the standard output file and to
      extract the total force look for the line that contains
      ``siesta:    Tot``. For example, if you have structured the calculation in
      different folders:

      .. code-block:: console

        $ grep -i  "Total =" */output.dat

      .. code-block::

        15AA/output.dat:siesta:         Total =    -222.538221
        35AA/output.dat:siesta:         Total =    -222.538164
        7_5AA/output.dat:siesta:         Total =    -222.538303


:bdg-success:`basic` Simple basis set options
---------------------------------------------

Now let's change some options related to basis sets.

.. dropdown:: **Are you under a local environment?**
  :color: primary

  Enter directory *02-CH4-Basis*

You can find the input files here:

* Pseudos psml: :download:`C.psml<work-files/02-CH4-Basis/C.psml>` and
  :download:`H.psml<work-files/02-CH4-Basis/H.psml>`
* Input file: :download:`ch4.fdf<work-files/02-CH4-Basis/ch4.fdf>`

.. dropdown:: A peek at the fdf file

  .. literalinclude:: work-files/02-CH4-Basis/ch4.fdf
    :linenos:

Change basis-set size or cardinality of the basis. Set ``PAO.BasisSize`` to any
of: DZ, SZP, DZP, TZP. See what happens with the total energy.

.. dropdown:: Answer: Change of basis-set size.
  :color: success

  In order to modify the bases-set size you should edit the line with the
  specific descriptor in the fdf file.

  .. literalinclude:: work-files/01-CH4-Basic/ch4.fdf
    :linenos:
    :lineno-match:
    :start-at:  # Basis
    :end-at: PAO.BasisSize   SZ

  *SZ* should be replaced by DZ, SZP or DZP.

  Everytime that you run SIESTA in a folder, all the output will be overwritten;
  thus, to be able to compare the results of the diferent runs, a specific
  directory should be created for each of them.

  The information of the total energy can be obtained by manually checking the
  standard output (as suggested above) or in a more automatic way doing a
  ``grep`` of ``Total =`` of the standard output file. For example, if you have
  structured the calculation in different folders:

  .. code-block:: console

    $ grep -i "Total =" */output.dat

  .. code-block::

      ene_DZ/output.dat:siesta:         Total =    -223.913569
      ene_DZP/output.dat:siesta:         Total =    -224.380785
      ene_SZ/output.dat:siesta:         Total =    -222.538221
      ene_SZP/output.dat:siesta:         Total =    -222.825577

Let's look at the effect of changing the basis set in some parameters with
physical meaning, like the bond length. In this tutorial you will be doing a
geometry optimization for CH4.

The input file *ch4.fdf* is already set up to optimize the geometry of a
single methane molecule, but until now it was comemented. Let's uncomment the
corresponding lines by removing ``#`` (line 39 and 40).

.. literalinclude:: work-files/02-CH4-Basis/ch4.fdf
  :linenos:
  :lineno-match:
  :start-at:  # Variable
  :end-at: steps  50

At the bottom of the file, you will find two options
used to define the basis set::

  PAO.BasisSize   SZ
  PAO.EnergyShift 250 meV

The idea of this exercise is to see what happens with the total energy and the
bond lengths of the system when modifying these options. The bond lengths
can be found in the *ch4.BONDS_FINAL* file.

The *SystemLabel.BONDS_FINAL* contains the information regarding the atomic
position of each atom of the unit cell and their distance with respect to each
of the other ones.

* See what happens when you increase the size of the basis set to **SZP** and
  **DZP**.

  .. dropdown:: Answer: Effect on the bond's length.
    :color: success

    The bond length will be found in the *ch4.BONDS_FINAL*. As we are studing
    ch4 we will have 1 C and 4 H, for each of them we will know the distances
    to their nearest neighbors. Let's check the output for the SZ case focusing
    on the carbon atom:

      .. code-block::

        Neighbors of:     1 C at:  -0.0150 -0.0150 -0.0133
        4 H                     1.1911  Ang. Really at:  -0.1868 -0.1868  2.2244
        3 H                     1.1913  Ang. Really at:  -0.5086  2.0880 -0.6471
        2 H                     1.1913  Ang. Really at:   2.0880 -0.5086 -0.6471
        5 H                     1.1915  Ang. Really at:  -1.4499 -1.4499 -0.9885

    The index and position of carbon are given by the first line, then, for each
    H atom, we have the index, label and distance to the carbon atom. Also,
    their absolute position is included. Your final results should look similar
    to this:

    .. code-block::

      Size     Total energy               C-H2    C-H3   C-H4   C-H5
      --------------------------------------------------------------
      SZ : Total: -222.996782 eV, Bonds: 1.1913/1.1913/1.1911/1.1915
      SZP: Total: -223.522888 eV, Bonds: 1.1601/1.1601/1.1596/1.1591
      DZP: Total: -225.830785 eV, Bonds: 1.1126/1.1126/1.1146/1.1149

* For **DZP**, try changing the energy shift to **100 meV**, **50 meV** and
  **10 meV**.

  .. dropdown:: Answer: Change of the energy shift.
    :color: success

    .. code-block::

      Shift     Total energy              C-H2    C-H3   C-H4   C-H5
      --------------------------------------------------------------
      100: Total: -225.929648 eV, Bonds: 1.1220/1.1220/1.1222/1.1223
      50 : Total: -225.929092 eV, Bonds: 1.1194/1.1194/1.1196/1.1197
      10 : Total: -225.760365 eV, Bonds: 1.1289/1.1289/1.1292/1.1296


Consider that the *experimental* bond-length for gas methane is **1.087 Å**.

* Which has the biggest effect in the energy, increasing the cardinality/size
  of the basis set, or changing the energy shift?

  .. dropdown:: Answer: Effect in the energy.
    :color: success

    From the first experiment, we can see that the cardinality of the basis set
    induces a change in the range of a few eV, which is much larger than that
    induced by the energy shift. Be aware that this happens when the basis set
    is relatively small; for basis sets larger than DZP (such as TZP) you don't
    usually get a huge improvement. However, going to SZ to SZP, or from SZP to
    DZP, the change in *flexibility* of the basis set is a determining factor in
    the quality of the results.

* In which case would you consider the basis set is "good enough"?

  .. dropdown:: Answer: How do we select the good one?
    :color: success

    We definitely need the DZP basis set to start getting reasonable results.
    But with regards to the energy shift, we can see that 50 meV is already
    reasonable enough to get good results. The total energy seems to be already
    converged to the meV level, and the bond lengths are also reasonable. That
    said, increasing the energy shift is not detrimental in this case, it will
    just increase the time of the calculation.

* Is it ok to compare the results to an experimental bond length? Or would it be
  better to check against an already converged basis set?

  .. dropdown:: Answer: Is the comparison with the experimental possible?
    :color: success

    The question is tricky and depends on what you are going after. If you are
    testing convergence of parameters, then by all means, compare to an already
    converged case, even if it is with another type of basis set (such as
    gaussians or plane-waves). That said, if you are already sure that your
    input parameters are good enough, then comparing to experimental values is
    a good idea: it will give you a hint on how your level of theory (i.e., DFT
    with a given functional) is able to model the real world data. In most cases,
    the best approach is to compare *trends* between experiments and theory,
    rather than absolute values.

:bdg-success:`basic` DFT functional
------------------------------------

.. dropdown:: **Are you under a local environment?**
  :color: primary

  Enter directory *03-CH4-XC-Functional*

You can find the input files here:

* Pseudos LDA: :download:`C.psml<work-files/03-CH4-XC-Functional/C.psml>`
  and :download:`H.psml<work-files/03-CH4-XC-Functional/H.psml>`
* Pseudos GGA: :download:`C.gga.psml<work-files/03-CH4-XC-Functional/C.gga.psml>`
  and :download:`H.gga.psml<work-files/03-CH4-XC-Functional/H.gga.psml>`
* Input file: :download:`ch4.fdf<work-files/03-CH4-XC-Functional/ch4.fdf>`

.. dropdown:: A peek at the fdf file

  .. literalinclude:: work-files/03-CH4-XC-Functional/ch4.fdf
    :linenos:

Up to now we have been implicitly using the local density approximation (LDA)
for our calculations, which is the default option in SIESTA. However, it is also
possible to use other functionals, such as those of GGA type. Edit the
*ch4.fdf* file to include this block::

  XC.functional GGA
  XC.authors    PBE

Notice that both :ref:`XC.Authors<fdfparam:xc.authors>` and
:ref:`XC.Functional<fdfparam:xc.functional>` are needed, and they must be
consistent. For ``XC.Functional GGA``, the most used variants are BLYP, PBE
and PBEsol. For ``XC.Functional VDW``, frequently used functionals are DRSLL,
LMKLL and VV. See the manual for more options on XC functionals.

Run SIESTA and look for possible lines with ``WARNING`` or ``ERROR`` in them.

.. dropdown:: Answer: Change of the functional.
    :color: success

    Even though the program ran successfully, you will find the following warning in
    the output::

      xc_check: WARNING: Pseudopotential generated with LDA PW92 functional

    You are using a GGA functional with a pseudopotential generated using LDA, as
    this is not consistent!. Fortunately, we have produced also the pseudopotentials
    using GGA for you. They are in the files *C.gga.psml* and *H.gga.psml*.

Modify the input file again to use these files by simply changing the *species*
strings::

  %block ChemicalSpeciesLabel
  1  6 C.gga   # Species index, atomic number, species label
  2  1 H.gga   # Species index, atomic number, species label
  %endblock ChemicalSpeciesLabel

Remember, *the species name can be anything you want, but it must be consistent
with the name of the pseudopotential files*. Run the program again and check
whether the warning disappears from the output.

It is worth noting that, as with pseudopotentials, DFT functionals are not easy
to compare between each other systematically. Again, it is worth expending
some time searching the literature for the type of simulations you are trying to
run.

.. dropdown:: Additional information: Visualizing the structure
  :color: info

  To visualize the structure included in the fdf file siesta provides a utility
  that allows you to convert it to stardand crystal formats, such us xsf or POSCAR, which
  then can be open with most of the graphical tools.

  * ``xv2xsf`` [it is available with the usual SIESTA compilation]. Converts
    the XV output, where atomic position and lattice vectors are saved, to an xsf file.

  .. code-block:: console

    $ xv2xsf

  It will ask which is the SystemLabel used, which, for this initial tutorial, it was *ch4*.

  .. code-block::

    Specify  SystemLabel (or 'siesta' if none): ch4
    Found and opened: ch4.XV

  Then, a *ch4.XSF* is genereated, and it can be visualize, for example using `ovito <https://www.ovito.org/>`_.

    .. image:: images/ch4.png
      :width: 400px
      :align: center

  * `sgeom <https://sisl.readthedocs.io/en/latest/scripts/sgeom.html>`_ [it
    is an exerternal tool part to the sisl python library]. Converts either fdf file
    or the XV to POSCAR file, additionally allows you to do some simple modifications
    to the structure.

  .. code-block:: console

    $ sgeom ch4.fdf POSCAR

  or

  .. code-block:: console

    $ sgeom ch4.fdf ch4.xsf


.. Plotting densities
.. ------------------
 You might  have noticed the lines::
  SaveRho .true.                 # -- Output of charge density
  %block LocalDensityOfStates    # -- LDOS ('charge' for an energy window)
  -6.00  -3.00 eV
  %endblock LocalDensityOfStates
 In response to these lines, SIESTA produced two extra files:
 * ``ch3.RHO``: contains the values of the self-consistent electronic
  density on the real-space mesh.
 * ``ch3.LDOS``: contains the 'charge density' associated only to the HOMO
  of the molecule. We had to specify an energy window (-6 to -3 eV) in
  which we know that there is only this state. (We can get this
  information by looking at the ch3.EIG file).
 .. hint::
   You can modify the block ``LocalDensityOfStates`` to plot
   the 'density' associated with different molecular orbitals lying in
   different energy windows.

 More details on how to visualize the charge density and other
 quantities represented on the real-space grid are given in :ref:`this
 how-to <how-to-rho2xsf>`. For this tutorial we will
 use Xcrysden.  Execute::
  rho2xsf < rho2xsf.inp
 to generate a 'ch3.XSF' file that contains both the total charge
 density and the LDOS information. Then::
  xcrysden --xsf ch3.XSF
 will open the Xcrysden window. You need to go into the 'Grid' section
 and set the options to select the data-set (ch3.RHO or ch3.LDOS). For
 the charge density, you can select your preferred combination of the 'up' and 'down'
 charge densities. If you give them factors of '+1' and '-1' you will get the
 spin-density, showing in essence the 'unpaired electron'.