Official Resources
- Source Repository: https://github.com/yoshida-lab/XenonPy
- Documentation: https://xenonpy.readthedocs.io/
- PyPI: https://pypi.org/project/xenonpy/
- License: Open source (BSD-3)
Overview
XenonPy is a Python library that implements a comprehensive set of machine learning tools for materials informatics. It provides descriptor calculation, model training, transfer learning, and inverse design capabilities for materials discovery from compositional and structural features.
Scientific domain: Materials informatics, ML for materials, descriptor calculation, transfer learning
Target user community: Researchers applying machine learning to materials discovery and property prediction
Theoretical Methods
- Compositional descriptors (elemental property statistics)
- Structural descriptors (radial distribution function, etc.)
- Transfer learning between material datasets
- Neural network models for property prediction
- Inverse design via generative models
- Bayesian optimization for materials screening
Capabilities (CRITICAL)
- 290+ elemental property descriptors
- Compositional and structural feature generation
- Transfer learning framework
- Neural network and gradient boosting models
- Inverse molecular design
- Model interpretability tools
- Cross-validation and model selection
Sources: GitHub repository, ReadTheDocs
Key Strengths
Comprehensive Descriptors:
- 290+ elemental properties
- Compositional descriptors (mean, variance, min, max, etc.)
- Structural descriptors
- Custom descriptor support
Transfer Learning:
- Pre-trained models on large datasets
- Fine-tuning for small target datasets
- Cross-domain transfer
- Improved prediction with limited data
End-to-End Pipeline:
- Feature generation
- Model training
- Prediction
- Inverse design
- Visualization
Inputs & Outputs
-
Input formats:
- Chemical compositions
- Crystal structures (CIF, POSCAR)
- Property datasets (CSV, DataFrame)
-
Output data types:
- Predicted properties
- Trained models
- Descriptors
- Inverse design candidates
Interfaces & Ecosystem
- pymatgen: Structure handling
- pandas: Data management
- PyTorch: Neural network backend
- scikit-learn: ML models
Performance Characteristics
- Speed: Fast (ML inference)
- Accuracy: Dataset dependent
- System size: Any
- Memory: Moderate
Computational Cost
- Descriptor calculation: Seconds
- Model training: Minutes to hours
- Prediction: Milliseconds per sample
- Typical: Efficient
Limitations & Known Constraints
- Data quality dependent: ML is only as good as training data
- Compositional only: Some models don't use structure
- PyTorch dependency: Required for neural networks
- Limited documentation: Could be more extensive
Comparison with Other Codes
- vs matminer: XenonPy has transfer learning, matminer is descriptor-focused
- vs JARVIS-Tools: XenonPy is ML-focused, JARVIS is broader
- vs AMP: XenonPy is property prediction, AMP is potential fitting
- Unique strength: Transfer learning framework for materials with 290+ elemental descriptors
Application Areas
Materials Discovery:
- Property prediction from composition
- High-throughput screening
- Inverse design of molecules
- Materials recommendation
Small Data Regimes:
- Transfer learning from large datasets
- Fine-tuning on limited experimental data
- Cross-domain knowledge transfer
- Active learning
Descriptor Engineering:
- Feature generation for ML
- Custom descriptor development
- Feature importance analysis
- Model interpretability
Best Practices
Data Preparation:
- Use clean, consistent datasets
- Split data properly for validation
- Normalize features appropriately
- Check for data leakage
Model Training:
- Use transfer learning for small datasets
- Cross-validate thoroughly
- Compare multiple model types
- Interpret model predictions
Community and Support
- Open source (BSD-3)
- PyPI installable
- ReadTheDocs documentation
- Developed by Yoshida Lab
- Published in Science and Technology of Advanced Materials
Verification & Sources
Primary sources:
- GitHub: https://github.com/yoshida-lab/XenonPy
- Documentation: https://xenonpy.readthedocs.io/
Confidence: VERIFIED
Verification status: ✅ VERIFIED
- Source code: ACCESSIBLE (GitHub)
- Documentation: ACCESSIBLE (ReadTheDocs)
- PyPI: AVAILABLE
- Specialized strength: Transfer learning framework for materials with 290+ elemental descriptors