{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 6 - Data Retrieval Functions" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from mdf_forge.forge import Forge" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "mdf = Forge()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Data Retrieval" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### globus_download\n", "If you want to access the raw data underlying entries in MDF, you can use `globus_download()` and provide the results from `search()` or `aggregate()`. You can customize how the data files are delivered by specifying a destination path to `dest` (default local directory) and/or setting `preserve_dir=True` if you want to recreate the directory structure of the original data.\n", "\n", "In order to use `globus_download()` to download to your computer, you must be running [Globus Connect Personal](https://www.globus.org/app/endpoints/create-gcp) . If you want to download to a different computer (which must be a Globus Endpoint), you have to specify `dest_ep=ID_of_destination_endpoint`.\n", "\n", "Please note that while _almost_ all data in MDF is accessible through a Globus Endpoint, there may be some entries that are not. A few datasets may be hosted elsewhere and only accessible through HTTP (see `http_download()`) or hosted elsewhere in a custom, non-programmatic configuration." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# NBVAL_SKIP\n", "# Running this example will save a file in the current directory.\n", "res = mdf.search(\"dft.converged:true AND mdf.resource_type:record\", limit=10)\n", "mdf.globus_download(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### http_download\n", "For small data, using Globus is not necessary. You can instead download data using HTTP(S). Except for the endpoint ID, the arguments are the same as `globus_download()`." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "Fetching files: 100%|██████████| 1/1 [00:00<00:00, 12087.33it/s]\n" ] }, { "data": { "text/plain": [ "{'success': True}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NBVAL_SKIP\n", "# Running this example will save a file in the current directory.\n", "res = mdf.search(\"mdf.source_name:oqmd* AND mdf.resource_type:record\", limit=1)\n", "mdf.http_download(res)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### http_stream\n", "If you want to use the data you're downloading directly in your code, you can use `http_stream()` to have the data `yield`-ed to you one entry at a time." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'{\"General\": null, \"Element\": \"Al\", \"Formula\": \"Al\", \"XPS Formula\": \"\", \"Name\": \"aluminum\", \"CAS Registry No\": \"7429-90-5\", \"Classes\": \"element\", \"Citation\": null, \"Author Name(s)\": \"McConville C.F., Seymour D.L., Woodruff D.R., Bao S.\", \"Journal\": \"Surf. Sci. 188, 1 (1987)\", \"Data Processing\": null, \"Data Type\": \"Photoelectron Line\", \"Line Designation\": \"2p3/2\", \"Quality of Data\": \"Adequate\", \"Binding Energy (eV)\": \"72.5\", \"Energy Uncertainty\": \"\", \"Background Subtraction Method\": \"\", \"Peak Location Method\": \"data\", \"Full Width at Half-maximum Intensity (eV)\": \"\", \"Gaussian Width (eV)\": \"\", \"Lorentzian Width (eV)\": \"\", \"Measurement Information\": null, \"Use of X-ray Monochromator\": \"Yes\", \"Excitation Energy\": \"other source\", \"X-ray Energy\": \"100\", \"Overal Energy Resolution (eV)\": \"0.18\", \"Calibration\": \"FL = Fermi level\", \"Charge Reference\": \"Conductor\", \"Energy Scale Evalution\": \"Reliable (reported energy within 300 eV of a reference energy)\", \"Specimen Information\": null, \"Specimen\": \"crystal\", \"Method of Determining Specimen Composition\": \"\", \"Method of Determining Specimen Crystallinity\": \"Low-energy Electron Diffraction\", \"Specimen Temperature (K)\": \"300\", \"Sample Quality\": \"Good\", \"Comment\": null, \"Notes\": \"Al(111).\"}'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# NBVAL_SKIP\n", "res = mdf.search(\"Al\", limit=1)\n", "raw_data = mdf.http_stream(res)\n", "next(raw_data)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }