# Basic information

Basic information includes the data source, file paths, and functions.

## Data source

In the data source section, please provide:

1. The names of the database, molecule, isotopologue, and dataset.
2. The molecule ID and isotopologue ID.

### Molecule database

**For ExoMol database**

The names of `Database`, `Molecule`, `Isotopologue`, and `Dataset` are required.
The molecule and isotopologue ID `SpeciesID` can be set to `0` or any other integer.

*Example*

```bash
# Data source #
Database                                ExoMol
Molecule                                H2O
Isotopologue                            1H2-16O
Dataset                                 POKAZATEL
SpeciesID                               11
```

**For ExoMolHR database**

The names of `Database`, `Molecule`, and `Isotopologue` are required.
The molecule and isotopologue ID `SpeciesID` can be set to `0` or any other integer.

*Example*

```bash
# Data source #
Database                                ExoMolHR
Molecule                                NO
Isotopologue                            14N-16O
SpeciesID                               81
```

**For HITRAN and HITEMP databases**

The `Database` name, `Molecule` name, and the molecule and isotopologue ID `SpeciesID` are required.
`SpeciesID` is the molecule ID followed directly by the one-digit isotopologue ID, with no blank between them. For example, `21` is molecule 2 (CO2), isotopologue 1, and `261` is molecule 26 (C2H2), isotopologue 1.
`SpeciesID` can be found from [HITRANOnline Isotopologue Metadata](https://hitran.org/docs/iso-meta/).
The names of `Isotopologue` and `Dataset` can be set to 'none' or any other string.

*Example*

```bash
# Data source #
Database                                HITRAN
Molecule                                CO2
Isotopologue                            none
Dataset                                 none
SpeciesID                               21
```

```bash
# Data source #
Database                                HITRAN
Molecule                                C2H2
Isotopologue                            abcd
Dataset                                 EfGh
SpeciesID                               261
```

### Atom and ion database

**For ExoAtom database**

The names of `Database`, `Atom`, and `Dataset` are required.
Only two datasets are available: `NIST` and `Kurucz`.

*Example*

```bash
# Data source #
Database                                ExoAtom
Atom                                    Ar
Dataset                                 NIST
```

```bash
# Data source #
Database                                ExoAtom
Atom                                    Al
Dataset                                 Kurucz
```

## File path

The file path section records the file paths for both reading and saving.
`ReadPath` and `SavePath` are folder paths; they should start with `/` but do not need to end with `/`.
`LogFilePath` is the path of the log file itself, not of its folder.

✅ /aaa/bbb/ccc

✅ /aaa/bbb/ccc/

✅ /aaa/bbb/ccc/ddd.log

**For ExoMol database**

`ReadPath` is the folder path of the input files, assuming the line list files are stored in the following layout.

`ReadPath`/`Molecule`/`Isotopologue`/`Dataset`/`Isotopologue`__`Dataset`.states.bz2

/mnt/data/exomol/exomol3_data/MgH/24Mg-1H/XAB/24Mg-1H__XAB.states.bz2

```
└── exomol3_data
    ├── C2H2
    ├── CO2
    │   ├── 12C-16O2
    │   ├── 13C-16O2
    │   ├── ...
    │   ├── 12C-16O2__air.broad
    │   └── 12C-16O2__self.broad
    ├── MgH
    │   ├── 24Mg-1H
    │   │   ├── Yadin
    │   │   └── XAB
    │   │       ├── 24Mg-1H__XAB.def
    │   │       ├── 24Mg-1H__XAB.def.json
    │   │       ├── 24Mg-1H__XAB.pf
    │   │       ├── 24Mg-1H__XAB.states.bz2
    │   │       └── 24Mg-1H__XAB.trans.bz2
    │   ├── 25Mg-1H
    │   │   ├── Yadin
    │   │   └── XAB
    │   │       ├── 25Mg-1H__XAB.def
    │   │       ├── 25Mg-1H__XAB.def.json
    │   │       ├── 25Mg-1H__XAB.pf
    │   │       ├── 25Mg-1H__XAB.states.bz2
    │   │       └── 25Mg-1H__XAB.trans.bz2
    │   └── 26Mg-1H
    ├── ...
    │
```

`SavePath` is the folder path for saving all results produced by the PyExoCross program.

`LogFilePath` is the file path of the log file; the program records its log output there automatically.

*Example*

```bash
# File path #
ReadPath                                /mnt/data/exomol/exomol3_data/
SavePath                                /home/jingxin/data/pyexocross/
LogFilePath                             /home/jingxin/data/pyexocross/log/MgH_ExoMol.log
```

**For ExoMolHR database**

`ReadPath` is the folder path of the input files, assuming the line list files are stored in the following layout.
For ExoMolHR line list files, the standard filename formats are:

`ReadPath`/`Molecule`/`Isotopologue`/date__`Isotopologue`__T.csv

/mnt/data/exomolhr/exomolhr_results/NO/14N-16O/20260311080614__14N-16O__1000K.csv

`ReadPath`/`Molecule`/`Isotopologue`/`Isotopologue`__`Dataset`.pf

/mnt/data/exomolhr/exomolhr_results/NO/14N-16O/14N-16O__XABC.pf

However, users can rename the CSV files in any format, as long as `Isotopologue` is included in the filename.

✅ date__`Isotopologue`__T.csv

✅ `Molecule`__`Isotopologue`.csv

✅ `Molecule`__`Isotopologue`__T.csv

```
└── exomolhr_results
    ├── C2H2
    ├── CO2
    │   ├── 12C-16O2
    │   ├── 13C-16O2
    │   ├── ...
    │   ├── 12C-16O2__air.broad
    │   └── 12C-16O2__self.broad
    ├── NO
    │   └── 14N-16O
    │       ├── 14N-16O__XABC.pf
    │       └── 20260311080614__14N-16O__1000K.csv
    ├── MgH
    │   ├── 24Mg-1H
    │   │   ├── 24Mg-1H__XAB.pf
    │   │   └── 20260311080614__24Mg-1H__1000K.csv
    │   ├── 25Mg-1H
    │   │   ├── 25Mg-1H__XAB.pf
    │   │   └── MgH__25Mg-1H__1000K.csv
    │   └── 26Mg-1H
    │       ├── 26Mg-1H__XAB.pf
    │       └── MgH__26Mg-1H.csv
    ├── ...
    │
```

`SavePath` is the folder path for saving all results produced by the PyExoCross program.

`LogFilePath` is the file path of the log file; the program records its log output there automatically.

*Example*

```bash
# File path #
ReadPath                                /mnt/data/exomolhr/exomolhr_results/
SavePath                                /home/jingxin/data/pyexocross/
LogFilePath                             /home/jingxin/data/pyexocross/log/NO_ExoMolHR.log
```

**For ExoAtom database**

`ReadPath` is the folder path of the input files, assuming the line list files are stored in the following layout.

`ReadPath`/`Atom`/`Dataset`/`Atom`__`Dataset`.states

/mnt/data/exoatom/exoatom_data/Li/NIST/Li__NIST.states

`SavePath` is the folder path for saving all results produced by the PyExoCross program.

`LogFilePath` is the file path of the log file; the program records its log output there automatically.
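The folder conventions above can be sketched in a few lines of Python. This is only an illustration of the documented patterns, not PyExoCross code: the helper names `exomol_states_path` and `exoatom_states_path` are hypothetical, and `PurePosixPath` is used so that a trailing `/` on `ReadPath` makes no difference.

```python
from pathlib import PurePosixPath


def exomol_states_path(read_path, molecule, isotopologue, dataset):
    """Hypothetical helper: ReadPath/Molecule/Isotopologue/Dataset/Isotopologue__Dataset.states.bz2"""
    name = f"{isotopologue}__{dataset}.states.bz2"
    return PurePosixPath(read_path) / molecule / isotopologue / dataset / name


def exoatom_states_path(read_path, atom, dataset):
    """Hypothetical helper: ReadPath/Atom/Dataset/Atom__Dataset.states"""
    name = f"{atom}__{dataset}.states"
    return PurePosixPath(read_path) / atom / dataset / name


# ReadPath may or may not end with "/"; both forms give the same result.
print(exomol_states_path("/mnt/data/exomol/exomol3_data/", "MgH", "24Mg-1H", "XAB"))
print(exoatom_states_path("/mnt/data/exoatom/exoatom_data", "Li", "NIST"))
```

The two printed paths correspond to the ExoMol and ExoAtom example paths given in this section.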
```
└── exoatom_data
    ├── Al
    ├── Al_p
    ├── Li
    │   ├── NIST
    │   │   ├── Li__NIST.adef.json
    │   │   ├── Li__NIST.pf
    │   │   ├── Li__NIST.states
    │   │   └── Li__NIST.trans
    │   └── Kurucz
    │       ├── Li__Kurucz.adef.json
    │       ├── Li__Kurucz.pf
    │       ├── Li__Kurucz.states
    │       └── Li__Kurucz.trans
    ├── Li_p
    │   ├── NIST
    │   │   ├── Li_p__NIST.adef.json
    │   │   ├── Li_p__NIST.pf
    │   │   ├── Li_p__NIST.states
    │   │   └── Li_p__NIST.trans
    │   └── Kurucz
    │       ├── Li_p__Kurucz.adef.json
    │       ├── Li_p__Kurucz.pf
    │       ├── Li_p__Kurucz.states
    │       └── Li_p__Kurucz.trans
    ├── ...
    │
```

*Example*

```bash
# File path #
ReadPath                                /mnt/data/exoatom/exoatom_data/
SavePath                                /home/jingxin/data/pyexocross/
LogFilePath                             /home/jingxin/data/pyexocross/log/Li_NIST.log
```

**For HITRAN and HITEMP databases**

`ReadPath` is the folder path of the input files, assuming the line list files are stored in the following layout.

`ReadPath`/`Molecule`/`Isotopologue`/`Molecule`__`Isotopologue`.par

/path/HITRAN/NO/14N-16O/NO__14N-16O.par

`SavePath` is the folder path for saving all results produced by the PyExoCross program.

`LogFilePath` is the file path of the log file; the program records its log output there automatically.

```
└── HITRAN
    ├── H2O
    ├── CO2
    ├── NO
    │   ├── 14N-16O
    │   │   ├── NO__14N-16O.par
    │   │   └── NO__14N-16O.pf / NO__14N-16O.txt
    │   ├── 15N-16O
    │   │   ├── NO__15N-16O.par
    │   │   └── NO__15N-16O.pf / NO__15N-16O.txt
    │   └── 14N-18O
    │       ├── NO__14N-18O.par
    │       └── NO__14N-18O.pf / NO__14N-18O.txt
    ├── NO_p
    │   └── 14N-16O_p
    │       ├── NO_p__14N-16O_p.par
    │       └── NO_p__14N-16O_p.pf / NO_p__14N-16O_p.txt
    ├── ...
    │
```

*Example*

```bash
# File path #
ReadPath                                /home/jingxin/data/HITRAN/
SavePath                                /home/jingxin/data/pyexocross/
LogFilePath                             /home/jingxin/data/pyexocross/log/CO2_HITRAN.log
```

## Functions

In the current version, *PyExoCross* can convert data between the ExoMol (ExoMol, ExoMolHR, and ExoAtom databases) and HITRAN (HITRAN and HITEMP databases) formats.
*PyExoCross* also implements the computation of other useful functions, including partition functions, specific heats, cooling functions, radiative lifetimes, oscillator strengths, LTE and non-LTE stick spectra, and cross sections.
For the ExoMolHR database, *PyExoCross* can only compute stick spectra and cross sections.

If users use `.inp` input files: *PyExoCross* provides computations of cooling functions, oscillator strengths, LTE and non-LTE stick spectra, and cross sections for data from the HITRAN format databases. To use the other functions, please first convert the data from the HITRAN format to the ExoMol format, then treat it as ExoMol data in *PyExoCross*.

If users use the Python package: *PyExoCross* converts the data format automatically if required.

Use this function or not: `0` means no; `1` means yes.

If the value in a function's second column is `0`, there is no need to change anything in that function's own section; the program will not process data with this function.

*Example*

```bash
# Functions #
Conversion                              0
PartitionFunctions                      0
SpecificHeats                           1
CoolingFunctions                        0
Lifetimes                               0
OscillatorStrengths                     0
StickSpectra                            0
CrossSections                           1
```

## Cores and chunks

Please provide the core settings `NCPUtrans` and `NCPUfiles` and the chunk size `ChunkSize` when *PyExoCross* uses multiprocessing. The program then runs on several cores in parallel.

`NCPUtrans`: The number of cores for processing each transitions file.

`NCPUfiles`: The number of transitions files processed at the same time.

`ChunkSize`: The program splits each transitions file into many chunks when reading and calculating. `ChunkSize` is the size of each chunk.

`RunMode`: Choose to run the program in CPU or GPU mode.

`GPUBackend`: GPU backend selection (only used when `RunMode=GPU`):

- `'AUTO'` (recommended): `PyTorch-CUDA -> CuPy-CUDA -> MPS -> CPU fallback`
- `'CUDA'`: `PyTorch-CUDA -> CuPy-CUDA -> MPS -> CPU fallback`
- `'PyTorch-CUDA'`: NVIDIA PyTorch CUDA only; otherwise CPU fallback
- `'CuPy-CUDA'`: NVIDIA CuPy CUDA only; otherwise CPU fallback
- `'MPS'`: Apple Metal (MPS) only; otherwise CPU fallback

**Note**

- MPS uses float32 kernels, so results can differ slightly from CPU/CUDA float64.
- If no compatible GPU backend is available, PyExoCross falls back to CPU automatically.
- GPU acceleration is available for cooling functions, stick spectra, cross sections, and stick spectra + cross sections.

`GPUBatchLines`: The number of lines to process in each batch on the GPU. Default value is 8192.

`GPUBatchGrid`: The number of grid points to process in each batch on the GPU. Default value is 256.

**Note**

Some suggestions on setting `NCPUtrans` and `NCPUfiles`:

1. `NCPUtrans` $\times$ `NCPUfiles` $\leq$ the number of cores available
2. Typical cases (depending on the `.trans` files):

| `.trans` files size | `.trans` files number | `NCPUtrans` vs. `NCPUfiles`       |
| :-----------------: | :-------------------: | :-------------------------------: |
| Very small          | Not small             | `NCPUtrans` $=1<$ `NCPUfiles`     |
| Not large           | Very large            | `NCPUtrans` $<$ `NCPUfiles`       |
| Not large           | Not large             | `NCPUtrans` $\approx$ `NCPUfiles` |
| Very large          | Small                 | `NCPUtrans` $>1=$ `NCPUfiles`     |

*Example*

```bash
# Cores and chunks #
NCPUtrans                               2
NCPUfiles                               4
ChunkSize                               500000
RunMode                                 CPU           # CPU(default) or GPU
GPUBackend                              AUTO          # AUTO(default): PyTorch-CUDA -> CuPy-CUDA -> MPS -> CPU fallback
GPUBatchLines                           8192          # GPU line-batch size (only used when RunMode=GPU)
GPUBatchGrid                            256           # GPU grid-batch size (only used when RunMode=GPU)
```

```bash
# Cores and chunks #
NCPUtrans                               8
NCPUfiles                               1
ChunkSize                               1000000
RunMode                                 GPU           # CPU(default) or GPU
GPUBackend                              AUTO          # AUTO(default): PyTorch-CUDA -> CuPy-CUDA -> MPS -> CPU fallback
GPUBatchLines                           8192          # GPU line-batch size (only used when RunMode=GPU)
GPUBatchGrid                            256           # GPU grid-batch size (only used when RunMode=GPU)
```

```bash
# CUDA policy (NVIDIA)
# Priority: PyTorch-CUDA -> CuPy-CUDA -> MPS -> CPU fallback
RunMode                                 GPU
GPUBackend                              CUDA
GPUBatchLines                           8192
GPUBatchGrid                            256
```

```bash
# Force PyTorch CUDA only
RunMode                                 GPU
GPUBackend                              PyTorch-CUDA
GPUBatchLines                           8192
GPUBatchGrid                            256
```

```bash
# Force CuPy CUDA only
RunMode                                 GPU
GPUBackend                              CuPy-CUDA
GPUBatchLines                           8192
GPUBatchGrid                            256
```

```bash
# Force MPS (Apple Silicon)
RunMode                                 GPU
GPUBackend                              MPS
GPUBatchLines                           8192
GPUBatchGrid                            256
```

## Quantum numbers

Please provide the labels `QNslabel` and formats `QNsformat` of the quantum numbers when you use *PyExoCross* to convert data formats or to calculate stick spectra or cross sections, if you need the quantum number filter.

* The definition files `.def`, `.def.json`, and `.adef.json` of the ExoMol and ExoAtom databases (available at [exomol.com](https://www.exomol.com/)) provide the labels and formats of the quantum numbers for each species for reference.
* HITRAN2020 supplementary material ([link](https://hitran.org/media/refs/HITRAN_QN_formats.pdf)) provides the notation and format for quanta identifications for reference.

**Note**

You can define the quantum number column names yourself, but please make sure each name contains letters and has no blanks. \
e.g. 'c1', 'c2', 'v1', 'v2', 'electronicState', 'electronic_state', '1v', '2v', 'M/E/C'. \
Wrong formats for quantum number column names: '1', '2', 'electronic state'.

*Example*

```bash
# Quantum numbers for conversion, stick spectra and cross sections #
QNslabel                                par  e/f   eS    v     Lambda   Sigma    Omega
QNsformat                               %1s  %1s   %13s  %3d   %1d      %7.1f    %7.1f
```
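As a rough illustration of the naming rule and of how labels line up with formats, the sketch below checks each label and pairs it with its printf-style format. It is not the parser used by *PyExoCross*; the function names `is_valid_qn_label` and `pair_labels_with_formats` are hypothetical, and the rule is interpreted here as "at least one letter and no whitespace".

```python
def is_valid_qn_label(label: str) -> bool:
    """Assumed rule: the label has at least one letter and no whitespace."""
    has_letter = any(ch.isalpha() for ch in label)
    has_blank = any(ch.isspace() for ch in label)
    return has_letter and not has_blank


def pair_labels_with_formats(label_line: str, format_line: str) -> dict:
    """Pair each quantum number label with its printf-style format string."""
    labels = label_line.split()[1:]    # drop the leading "QNslabel" keyword
    formats = format_line.split()[1:]  # drop the leading "QNsformat" keyword
    if len(labels) != len(formats):
        raise ValueError("QNslabel and QNsformat must have the same number of entries")
    bad = [name for name in labels if not is_valid_qn_label(name)]
    if bad:
        raise ValueError(f"invalid quantum number labels: {bad}")
    return dict(zip(labels, formats))


qn_formats = pair_labels_with_formats(
    "QNslabel   par  e/f  eS    v    Lambda  Sigma  Omega",
    "QNsformat  %1s  %1s  %13s  %3d  %1d     %7.1f  %7.1f",
)

# Each format can be applied with Python's % operator, e.g. a Sigma value of 0.5
# is written right-aligned in a 7-character field with one decimal place.
print(repr(qn_formats["Sigma"] % 0.5))
```

Splitting on whitespace is the reason a label such as 'electronic state' cannot work: it would be read as two separate labels and no longer match the number of formats.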