Part 1 - Introduction
[1]:
from mdf_forge.forge import Forge # This is the only required import for Forge.
Authentication
Authentication is handled automatically. Just follow the prompt once and let Forge take care of the rest.
[2]:
# You can set up Forge with no arguments. Forge will automatically authenticate and connect to MDF.
mdf = Forge()
Basic Queries
Basic full text search
Using the search() method, you can perform a basic text search of the data in MDF. You will get back a list of matching entries (up to 10,000).
Let’s say we want to find data on aluminum. We can just search for “Al” like so:
[3]:
res = mdf.search("Al")
res[0]
[3]:
{'files': [{'data_type': 'ASCII text, with very long lines, with no line terminators',
'filename': 'nist_xps_41530.json',
'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json',
'length': 1248,
'mime_type': 'text/plain',
'sha512': '69912ca91261bba53dc0df956338baebf81a3f9d1281f4e9108200c3b8473f073ffdff7437a55c8ac3d08d40074a68a5509bbeb1a391426f838427398f3963dd',
'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/nist_xps_db_v1/nist_xps_41530.json'}],
'material': {'composition': 'Al', 'elements': ['Al']},
'mdf': {'ingest_date': '2018-11-06T16:57:59.847843Z',
'mdf_id': '5be1c83f2ef3883312753d4a',
'parent_id': '5be1c8172ef388331274efdf',
'resource_type': 'record',
'scroll_id': 19819,
'source_id': 'nist_xps_db_v1',
'source_name': 'nist_xps_db',
'version': 1},
'nist_xps_db': {'binding_energy_ev': '72.5',
'energy_uncertainty_ev': '',
'notes': 'Al(111).',
'temperature_k': '300'}}
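Each entry is a plain nested dictionary, so ordinary indexing is enough to pull out the fields you care about. The following is a minimal sketch, assuming entries have the same layout as the output above. The summarize_entry function is a hypothetical helper, not part of Forge, and the sample dictionary is a trimmed copy of that output so the sketch runs without a connection to MDF.

```python
def summarize_entry(entry):
    """Return a short summary of one search result.

    Hypothetical helper for illustration; not a Forge API.
    Missing blocks are tolerated because not every entry has every key.
    """
    mdf_block = entry.get("mdf", {})
    material = entry.get("material", {})
    return {
        "source_name": mdf_block.get("source_name"),
        "resource_type": mdf_block.get("resource_type"),
        "composition": material.get("composition"),
        "filenames": [f.get("filename") for f in entry.get("files", [])],
    }


# Trimmed copy of the entry shown above, used so the sketch runs offline.
sample_entry = {
    "files": [{"filename": "nist_xps_41530.json", "length": 1248}],
    "material": {"composition": "Al", "elements": ["Al"]},
    "mdf": {"source_name": "nist_xps_db", "resource_type": "record"},
}

summary = summarize_entry(sample_entry)
print(summary)
```

With a live connection, the same helper could be applied to res[0] from the search above.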
Advanced-mode searches
You can also query more precisely with the advanced=True argument. The basic form is key.subkey:value. The full documentation for the query syntax can be found here: http://globus-search-docs.s3-website-us-east-1.amazonaws.com/stable/api/search.html#_query_syntax
In this example, we search for “Al” inside the “material.elements” key.
We’re also going to limit the number of results to 10.
[4]:
res = mdf.search("material.elements:Al", advanced=True, limit=10)
res[0]
[4]:
{'cip': {'bv': '79.0',
'energy': '-3.36',
'forcefield': 'Al99.eam.alloy',
'gv': '29.4',
'mpid': 'mp-134',
'totenergy': '-107.52'},
'files': [{'data_type': 'ASCII text, with very long lines, with no line terminators',
'filename': 'classical_interatomic_potentials.json',
'globus': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/cip_v1/classical_interatomic_potentials.json',
'length': 1841203,
'mime_type': 'text/plain',
'sha512': '96635ee0c15d1d0187b18805653a02b1a6dfa5648db82153467045de18adcc08c753e2897d2b48a78a2167a442219e9aeff6b1103732c2158facac8fa4911b33',
'url': 'https://e38ee745-6d04-11e5-ba46-22000b92c6ec.e.globus.org/MDF/mdf_connect/prod/data/cip_v1/classical_interatomic_potentials.json'}],
'material': {'composition': 'Al32', 'elements': ['Al']},
'mdf': {'ingest_date': '2018-10-29T17:47:57.468388Z',
'mdf_id': '5bd747cf2ef3880b0f213904',
'parent_id': '5bd747cd2ef3880b0f2135d1',
'resource_type': 'record',
'scroll_id': 819,
'source_id': 'cip_v1',
'source_name': 'cip',
'version': 1}}
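The two searches so far returned first entries from different datasets (nist_xps_db and cip). One way to see which datasets a result list draws from is to tally entries by their mdf.source_name value. This is a sketch only: count_by_source is a hypothetical helper, not part of Forge, and the sample list is illustrative data built from the source names shown above rather than real search output.

```python
from collections import Counter


def count_by_source(results):
    """Count entries per mdf.source_name (hypothetical helper, not a Forge API)."""
    return Counter(
        entry["mdf"]["source_name"]
        for entry in results
        # Skip entries that lack the expected keys instead of raising KeyError.
        if "source_name" in entry.get("mdf", {})
    )


# Illustrative stand-in for a result list; real results come from mdf.search().
sample_results = [
    {"mdf": {"source_name": "nist_xps_db", "resource_type": "record"}},
    {"mdf": {"source_name": "cip", "resource_type": "record"}},
    {"mdf": {"source_name": "cip", "resource_type": "record"}},
    {"material": {"elements": ["Al"]}},  # no "mdf" block, so it is skipped
]

counts = count_by_source(sample_results)
print(counts.most_common())
```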
If you want to search on a value with special characters, such as a colon or space, you must wrap the value in double quotes. Otherwise, you may get unexpected results.
[5]:
res = mdf.search('dc.titles.title:"High-throughput Ab-initio Dilute Solute Diffusion Database"', advanced=True)
res[0]
[5]:
{'data': {'endpoint_path': 'globus://e38ee745-6d04-11e5-ba46-22000b92c6ec/MDF/mdf_connect/prod/data/ab_initio_solute_database_v1-2/',
'link': 'https://www.globus.org/app/transfer?origin_id=e38ee745-6d04-11e5-ba46-22000b92c6ec&origin_path=/MDF/mdf_connect/prod/data/ab_initio_solute_database_v1-2/'},
'dc': {'contributors': [{'affiliations': ['University of Wisconsin-Madison'],
'contributorName': 'Morgan, Dane',
'contributorType': 'ContactPerson',
'familyName': 'Morgan',
'givenName': 'Dane'}],
'creators': [{'affiliations': ['University of Wisconsin-Madison'],
'creatorName': 'Morgan, Dane',
'familyName': 'Morgan',
'givenName': 'Dane'},
{'affiliations': ['University of Wisconsin-Madison'],
'creatorName': 'Mayeshiba, Tam',
'familyName': 'Mayeshiba',
'givenName': 'Tam'},
{'affiliations': ['University of Wisconsin-Madison'],
'creatorName': 'Henry, Wu',
'familyName': 'Henry',
'givenName': 'Wu'}],
'dates': [{'date': '2017-08-07T16:07:32.938812Z', 'dateType': 'Collected'}],
'descriptions': [{'description': 'We demonstrate automated generation of diffusion databases from high-throughput density functional theory (DFT) calculations. A total of more than 230 dilute solute diffusion systems in Mg, Al, Cu, Ni, Pd, and Pt host lattices have been determined using multi-frequency diffusion models. We apply a correction method for solute diffusion in alloys using experimental and simulated values of host self-diffusivity.',
'descriptionType': 'Other'}],
'publicationYear': '2016',
'publisher': 'MDF (placeholder)',
'relatedIdentifiers': [{'relatedIdentifier': 'http://dx.doi.org/10.1038/sdata.2016.54',
'relatedIdentifierType': 'DOI',
'relationType': 'IsPartOf'}],
'resourceType': {'resourceType': 'JSON', 'resourceTypeGeneral': 'Dataset'},
'subjects': [{'subject': 'dilute'},
{'subject': 'solute'},
{'subject': 'DFT'},
{'subject': 'diffusion'},
{'subject': 'dataset'}],
'titles': [{'title': 'High-throughput Ab-initio Dilute Solute Diffusion Database'}]},
'mdf': {'ingest_date': '2018-11-24T08:12:11.852893Z',
'mdf_id': '5bf907db2ef3885ee1191ae0',
'resource_type': 'dataset',
'scroll_id': 0,
'source_id': 'ab_initio_solute_database_v1-2',
'source_name': 'ab_initio_solute_database',
'version': 1},
'services': {'mdf_search': 'This dataset was ingested to MDF Search.'}}
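The quoting rule for values with special characters can be wrapped in a small helper so that query strings are built consistently. This is a minimal sketch: quote_term is a hypothetical function, not part of Forge, and escaping embedded double quotes and backslashes with a backslash is an assumption about the query syntax that should be checked against the query syntax documentation linked earlier.

```python
def quote_term(field, value):
    """Build an advanced-mode term of the form field:"value".

    Hypothetical helper, not a Forge API. Backslash-escaping of embedded
    quotes and backslashes is an assumption about the query syntax.
    """
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


query = quote_term(
    "dc.titles.title",
    "High-throughput Ab-initio Dilute Solute Diffusion Database",
)
print(query)

# With an authenticated Forge instance, the string would be passed as:
# res = mdf.search(query, advanced=True)
```

The resulting string is identical to the hand-written query used in the cell above.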