{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Part 1 - Introduction" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from mdf_forge.forge import Forge # This is the only required import for Forge." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Authentication\n", "Authentication is handled automatically. Just follow the prompt once and let Forge take care of the rest.\n" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "# You can set up Forge with no arguments. Forge will automatically authenticate and connect to MDF.\n", "mdf = Forge()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Basic Queries" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Basic full text search\n", "Using the `search()` method, you can perform a basic text search of the data in MDF.\n", "You will get back a list of matching entries (up to 10,000).\n", "\n", "Let's say we want to find data on aluminum. 
We can just search for \"Al\" like so:" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "{'files': [{'data_type': 'ASCII text, with very long lines, with no line terminators',\n", " 'filename': 'nist_xps_41530.json',\n", " 'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json',\n", " 'length': 1248,\n", " 'mime_type': 'text/plain',\n", " 'sha512': '69912ca91261bba53dc0df956338baebf81a3f9d1281f4e9108200c3b8473f073ffdff7437a55c8ac3d08d40074a68a5509bbeb1a391426f838427398f3963dd',\n", " 'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json'}],\n", " 'material': {'composition': 'Al', 'elements': ['Al']},\n", " 'mdf': {'ingest_date': '2018-11-06T16:57:59.847843Z',\n", " 'mdf_id': '5be1c83f2ef3883312753d4a',\n", " 'parent_id': '5be1c8172ef388331274efdf',\n", " 'resource_type': 'record',\n", " 'scroll_id': 19819,\n", " 'source_id': 'nist_xps_db_v1',\n", " 'source_name': 'nist_xps_db',\n", " 'version': 1},\n", " 'nist_xps_db': {'binding_energy_ev': '72.5',\n", " 'energy_uncertainty_ev': '',\n", " 'notes': 'Al(111).',\n", " 'temperature_k': '300'}}" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "res = mdf.search(\"Al\")\n", "res[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Advanced-mode searches\n", "You can also query more precisely with the `advanced=True` argument. The basic form of an advanced query is `key.subkey:value`. The full documentation for the query syntax can be found here: http://globus-search-docs.s3-website-us-east-1.amazonaws.com/stable/api/search.html#_query_syntax\n", "\n", "In this example, we search for \"Al\" inside the \"material.elements\" key.\n", "\n", "We're also going to limit the number of results to 10."
] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "{'cip': {'bv': '79.0',\n", " 'energy': '-3.36',\n", " 'forcefield': 'Al99.eam.alloy',\n", " 'gv': '29.4',\n", " 'mpid': 'mp-134',\n", " 'totenergy': '-107.52'},\n", " 'files': [{'data_type': 'ASCII text, with very long lines, with no line terminators',\n", " 'filename': 'classical_interatomic_potentials.json',\n", " 'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/cip_v1/classical_interatomic_potentials.json',\n", " 'length': 1841203,\n", " 'mime_type': 'text/plain',\n", " 'sha512': '96635ee0c15d1d0187b18805653a02b1a6dfa5648db82153467045de18adcc08c753e2897d2b48a78a2167a442219e9aeff6b1103732c2158facac8fa4911b33',\n", " 'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/cip_v1/classical_interatomic_potentials.json'}],\n", " 'material': {'composition': 'Al32', 'elements': ['Al']},\n", " 'mdf': {'ingest_date': '2018-10-29T17:47:57.468388Z',\n", " 'mdf_id': '5bd747cf2ef3880b0f213904',\n", " 'parent_id': '5bd747cd2ef3880b0f2135d1',\n", " 'resource_type': 'record',\n", " 'scroll_id': 819,\n", " 'source_id': 'cip_v1',\n", " 'source_name': 'cip',\n", " 'version': 1}}" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "res = mdf.search(\"material.elements:Al\", advanced=True, limit=10)\n", "res[0]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you want to search on a value with special characters, such as a colon or space, you must wrap the value in double quotes. Otherwise, you may get unexpected results." 
] }, { "cell_type": "code", "execution_count": 5, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/plain": [ "{'data': {'endpoint_path': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/ab_initio_solute_database_v1-2/',\n", " 'link': 'https://www.globus.org/app/transfer?origin_id=e38ee745-6d04-11e5-ba46-22000b92c6ec&origin_path=/MDF/mdf_connect/prod/data/ab_initio_solute_database_v1-2/'},\n", " 'dc': {'contributors': [{'affiliations': ['University of Wisconsin-Madison'],\n", " 'contributorName': 'Morgan, Dane',\n", " 'contributorType': 'ContactPerson',\n", " 'familyName': 'Morgan',\n", " 'givenName': 'Dane'}],\n", " 'creators': [{'affiliations': ['University of Wisconsin-Madison'],\n", " 'creatorName': 'Morgan, Dane',\n", " 'familyName': 'Morgan',\n", " 'givenName': 'Dane'},\n", " {'affiliations': ['University of Wisconsin-Madison'],\n", " 'creatorName': 'Mayeshiba, Tam',\n", " 'familyName': 'Mayeshiba',\n", " 'givenName': 'Tam'},\n", " {'affiliations': ['University of Wisconsin-Madison'],\n", " 'creatorName': 'Henry, Wu',\n", " 'familyName': 'Henry',\n", " 'givenName': 'Wu'}],\n", " 'dates': [{'date': '2017-08-07T16:07:32.938812Z', 'dateType': 'Collected'}],\n", " 'descriptions': [{'description': 'We demonstrate automated generation of diffusion databases from high-throughput density functional theory (DFT) calculations. A total of more than 230 dilute solute diffusion systems in Mg, Al, Cu, Ni, Pd, and Pt host lattices have been determined using multi-frequency diffusion models. 
We apply a correction method for solute diffusion in alloys using experimental and simulated values of host self-diffusivity.',\n", " 'descriptionType': 'Other'}],\n", " 'publicationYear': '2016',\n", " 'publisher': 'MDF (placeholder)',\n", " 'relatedIdentifiers': [{'relatedIdentifier': 'http://dx.doi.org/10.1038/sdata.2016.54',\n", " 'relatedIdentifierType': 'DOI',\n", " 'relationType': 'IsPartOf'}],\n", " 'resourceType': {'resourceType': 'JSON', 'resourceTypeGeneral': 'Dataset'},\n", " 'subjects': [{'subject': 'dilute'},\n", " {'subject': 'solute'},\n", " {'subject': 'DFT'},\n", " {'subject': 'diffusion'},\n", " {'subject': 'dataset'}],\n", " 'titles': [{'title': 'High-throughput Ab-initio Dilute Solute Diffusion Database'}]},\n", " 'mdf': {'ingest_date': '2018-11-24T08:12:11.852893Z',\n", " 'mdf_id': '5bf907db2ef3885ee1191ae0',\n", " 'resource_type': 'dataset',\n", " 'scroll_id': 0,\n", " 'source_id': 'ab_initio_solute_database_v1-2',\n", " 'source_name': 'ab_initio_solute_database',\n", " 'version': 1},\n", " 'services': {'mdf_search': 'This dataset was ingested to MDF Search.'}}" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "res = mdf.search('dc.titles.title:\"High-throughput Ab-initio Dilute Solute Diffusion Database\"', advanced=True)\n", "res[0]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.4" } }, "nbformat": 4, "nbformat_minor": 2 }