A software environment for statistical analysis, molecular viewing, descriptor generation, and similarity search.
Jack Liu, Jun Feng, Atina Brooks and Stan Young
National Institute of Statistical Sciences
Basic Functions:
- Supports MDL SDF format
- Displays molecules in multiple columns.
- Displays properties contained in the SD file in a table.
- Anti-alias technology for best picture quality.
- Table of molecule pictures and properties can be exported to Excel (Office XP and above) to generate personalized reports.
- Calculates three types of binary atom pair descriptors and continuous weighted Burden numbers.
- Searches over ACL library to determine possible mechanisms or side effects. The user can create and load their personal databases.
- Calculates drug-like properties such as LogP, PSA, MW, HBAs, HBDs, etc.
- Builds regression models using Least Angle Regression (LARS) and LASSO-2.
- Builds regression and classification models using Random Forest through a graphical interface to R.
- Cluster analysis with KMeans through a graphical interface to R.
- Outlier detection using the tetrads method (Douglas Hawkins et al.; code implemented by Andrew Wong).
- Novel robust singular value decomposition (RSVD) for large datasets with missing values or outliers.
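To make the binary atom pair descriptors and similarity search listed above more concrete, here is a minimal Python sketch. It is not the software's own code: the feature definition (atom type, atom type, bond-count distance), the Tanimoto coefficient as the similarity measure, and the function names atom_pair_bits and tanimoto are all assumptions chosen for illustration, and the three descriptor types the software actually calculates may be defined differently.

```python
from collections import deque


def shortest_path_lengths(adjacency, start):
    """Breadth-first search: bond-count distance from `start` to each reachable atom."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def atom_pair_bits(atom_types, bonds):
    """Return the set of (type_a, type_b, distance) features present in a molecule.

    Hypothetical helper: atom_types is a list of element symbols and bonds is a
    list of (i, j) index pairs. A feature is "on" if at least one atom pair has it.
    """
    adjacency = {i: [] for i in range(len(atom_types))}
    for a, b in bonds:
        adjacency[a].append(b)
        adjacency[b].append(a)
    bits = set()
    for i in range(len(atom_types)):
        for j, d in shortest_path_lengths(adjacency, i).items():
            if j > i:  # count each unordered pair once
                a, b = sorted((atom_types[i], atom_types[j]))
                bits.add((a, b, d))
    return bits


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared features divided by total distinct features."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 1.0


# Heavy-atom graphs only: ethanol (C-C-O) and ethylamine (C-C-N).
ethanol = atom_pair_bits(["C", "C", "O"], [(0, 1), (1, 2)])
ethylamine = atom_pair_bits(["C", "C", "N"], [(0, 1), (1, 2)])
print(tanimoto(ethanol, ethylamine))  # 0.2: one shared feature out of five
```

In a similarity search of this kind, a query molecule's feature set would be compared against every library entry and the highest-scoring entries reported.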
BASIC VERSION
Download now
Version 0.61 released! 02/03/2005
Note: Users in Denmark and some other European countries should change their Regional Settings to U.S. to avoid a file-saving bug.
AFFILIATE VERSION (requires Affiliate userid and password)
Version 0.71 released! 06/08/2006
Become a NISS Affiliate and get our latest version with better graphics, better descriptors, and substructure searching functions.
DATA SET:
Data set of 317 compounds in 21 biological classes from Xue 2002
Prerequisites:
1. Microsoft .NET Framework 1.1 or above (required)
2. R 2.3.1 (required for Random Forest and KMeans; after installing R, install the randomForest package from within R)
3. DirectX 9.0c Runtime (optional for 3D viewing)