Category: Data Scientists
Supervised domain-agnostic prediction framework for probabilistic modelling
A supervised domain-agnostic framework that allows for probabilistic modelling, namely the prediction of probability distributions for individual data points. The package offers a variety of features and specifically allows for: the implementation of probabilistic prediction strategies in the supervised context; comparison of frequentist and Bayesian prediction methods; strategy optimization through hyperparameter tuning and ensemble methods (e.g. bagging); and workflow automation. List of developers and contributors. Documentation: The full documentation is available here. Installation: Installation is easy using Python’s package manager $ […]
A data science tool that captures and stores model training and execution information
Purpose: Rubicon is a data science tool that captures and stores model training and execution information, like parameters and outcomes, in a repeatable and searchable way. Rubicon’s git integration associates these inputs and outputs directly with the model code that produced them to ensure full auditability and reproducibility for developers and stakeholders alike. While experimenting, the Rubicon dashboard makes it easy to explore, filter, visualize, and share recorded work. Components: Rubicon is composed of three parts: A Python library […]
An open-source package for creating experiments in behavioral science
PsychoPy PsychoPy is an open-source package for creating experiments in behavioral science. It aims to provide a single package that is: precise enough for psychophysics; easy enough for teaching; flexible enough for everything else; and able to run experiments in a local Python script or online in JavaScript. To meet these goals, PsychoPy provides a choice of interface: you can use a simple graphical user interface called Builder, or write your experiments in Python code. The entire application and library are written […]
A Python Package for Stochastic Nonparametric Envelopment of Data
pyStoNED pyStoNED is a Python package that provides functions for estimating Convex Nonparametric Least Squares (CNLS), Stochastic Nonparametric Envelopment of Data (StoNED), and other StoNED-related variants such as Convex Quantile Regression (CQR), Convex Expectile Regression (CER), and Isotonic CNLS (ICNLS). It also provides efficiency measurement using Data Envelopment Analysis (DEA) and Free Disposal Hull (FDH). The pyStoNED package allows the user to estimate the CNLS/StoNED frontiers in an open-access environment and is built on Pyomo. The pyStoNED […]
Quickly download, clean up, and install public datasets into a database management system
retriever Finding data is one thing. Getting it ready for analysis is another. Acquiring, cleaning, standardizing, and importing publicly available data is time-consuming because many datasets lack machine-readable metadata and do not conform to established data structures and formats. The Data Retriever automates the first steps in the data analysis pipeline by downloading, cleaning, and standardizing datasets, and importing them into relational databases, flat files, or programming languages. The automation of this process reduces the time for a […]
A Python package for plasma science that is under development
PlasmaPy PlasmaPy is an open-source, community-developed Python 3.7+ package for plasma science. PlasmaPy intends to be for plasma science what Astropy is for astronomy: a collection of functionality commonly used and shared between plasma scientists and researchers globally, running within and leveraging the open-source scientific Python ecosystem. The goals of this project are more thoroughly described in this recent video. Current functionality is described in PlasmaPy’s online documentation. Installation: If you have installed Python, you can install […]