Part 6 - Data Retrieval Functions

[1]:
from mdf_forge.forge import Forge
[2]:
mdf = Forge()

Data Retrieval

globus_download

To access the raw data underlying entries in MDF, pass the results from search() or aggregate() to globus_download(). You can control how the data files are delivered by passing a destination path as dest (default: the current local directory) and by setting preserve_dir=True to recreate the directory structure of the original data.

To download to your own computer with globus_download(), you must be running Globus Connect Personal. To download to a different computer (which must be a Globus Endpoint), specify dest_ep=ID_of_destination_endpoint.

Note that although almost all data in MDF is accessible through a Globus Endpoint, some entries may not be. A few datasets may be hosted elsewhere and accessible only through HTTP (see http_download()) or only in a custom, non-programmatic configuration.

[ ]:
# NBVAL_SKIP
# Running this example will save a file in the current directory.
res = mdf.search("dft.converged:true AND mdf.resource_type:record", limit=10)
mdf.globus_download(res)
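
The delivery options described above can be combined in one call. The following is a minimal sketch, not part of the original notebook: download_preserving_layout is a hypothetical helper name, and it assumes only the dest, preserve_dir, and dest_ep keyword arguments named in the text. It takes the Forge instance as a parameter, so nothing is transferred until you call it.

```python
def download_preserving_layout(forge, results, dest, dest_ep=None):
    """Download results via Globus, recreating the original directory structure.

    forge   -- a Forge instance (for example, the mdf object created above)
    results -- the output of search() or aggregate()
    dest    -- destination path for the downloaded files
    dest_ep -- optional ID of a destination Globus Endpoint (hypothetical
               parameter of this helper, forwarded as dest_ep)
    """
    kwargs = {"dest": dest, "preserve_dir": True}
    # Only forward dest_ep when the caller supplied one, so the library's
    # default destination (this computer) is used otherwise.
    if dest_ep is not None:
        kwargs["dest_ep"] = dest_ep
    return forge.globus_download(results, **kwargs)
```

For example, download_preserving_layout(mdf, res, dest="./mdf_data") would request the files from the search above into a local mdf_data directory, which still requires Globus Connect Personal to be running.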

http_download

For small data, using Globus is not necessary. You can instead download data using HTTP(S). Except for the endpoint ID (dest_ep), http_download() takes the same arguments as globus_download().

[4]:
# NBVAL_SKIP
# Running this example will save a file in the current directory.
res = mdf.search("mdf.source_name:oqmd* AND mdf.resource_type:record", limit=1)
mdf.http_download(res)
Fetching files: 100%|██████████| 1/1 [00:00<00:00, 12087.33it/s]
[4]:
{'success': True}
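
In a script, it may be worth checking the returned dictionary before using the downloaded files. The sketch below is illustrative only: ensure_success is a hypothetical helper, and it relies solely on the 'success' key visible in the output above. No other keys are assumed.

```python
def ensure_success(result):
    """Return result unchanged, or raise if its 'success' flag is not truthy."""
    if not result.get("success"):
        raise RuntimeError(f"Download did not report success: {result!r}")
    return result


# The dictionary below mirrors the output shown above; in practice you
# would pass the return value of mdf.http_download(res) instead.
checked = ensure_success({"success": True})
```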

http_stream

If you want to use the data you're downloading directly in your code, you can use http_stream(), which returns a generator that yields the data to you one entry at a time.

[5]:
# NBVAL_SKIP
res = mdf.search("Al", limit=1)
raw_data = mdf.http_stream(res)
next(raw_data)
[5]:
'{"General": null, "Element": "Al", "Formula": "Al", "XPS Formula": "", "Name": "aluminum", "CAS Registry No": "7429-90-5", "Classes": "element", "Citation": null, "Author Name(s)": "McConville C.F., Seymour D.L., Woodruff D.R., Bao S.", "Journal": "Surf. Sci. 188, 1 (1987)", "Data Processing": null, "Data Type": "Photoelectron Line", "Line Designation": "2p3/2", "Quality of Data": "Adequate", "Binding Energy (eV)": "72.5", "Energy Uncertainty": "", "Background Subtraction Method": "", "Peak Location Method": "data", "Full Width at Half-maximum Intensity (eV)": "", "Gaussian Width (eV)": "", "Lorentzian Width (eV)": "", "Measurement Information": null, "Use of X-ray Monochromator": "Yes", "Excitation Energy": "other source", "X-ray Energy": "100", "Overal Energy Resolution (eV)": "0.18", "Calibration": "FL       = Fermi level", "Charge Reference": "Conductor", "Energy Scale Evalution": "Reliable (reported energy within 300 eV of a reference energy)", "Specimen Information": null, "Specimen": "crystal", "Method of Determining Specimen Composition": "", "Method of Determining Specimen Crystallinity": "Low-energy Electron Diffraction", "Specimen Temperature (K)": "300", "Sample Quality": "Good", "Comment": null, "Notes": "Al(111)."}'
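
The entry above is returned as a string of JSON text, so it can be decoded with the standard json module before use. The following is a minimal sketch under that assumption; files in other formats would need a different parser. parse_entries is a hypothetical helper, and sample_stream is a stand-in for mdf.http_stream(res) that holds an abbreviated copy of the entry shown above.

```python
import json


def parse_entries(stream):
    """Yield each streamed entry decoded from JSON text into a dict."""
    for raw in stream:
        yield json.loads(raw)


# Stand-in for mdf.http_stream(res): three fields copied from the entry above.
sample_stream = iter(
    ['{"Element": "Al", "Name": "aluminum", "Binding Energy (eV)": "72.5"}']
)

record = next(parse_entries(sample_stream))
# Numeric fields arrive as strings in this record, so convert explicitly.
binding_energy = float(record["Binding Energy (eV)"])
print(record["Element"], binding_energy)
```

Because parse_entries is itself a generator, entries are still decoded one at a time rather than all at once.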