On episode 13, we recap the events of the Data Array Summit that took place last weekend at Enthought HQ in Austin, TX. The summit was a chance for NumPy developers from the community to meet face-to-face and talk about the ‘labeled array’ or ‘data array’ concept that seems to crop up just about everywhere you look.
The current state of affairs in Python scientific and statistical computing is compared to that of other languages, like R. The major topic of discussion is an API that gives NumPy arrays increased functionality in a sufficiently Pythonic syntax while, of course, retaining performance.
Today’s hosts include:
- Wes McKinney (special guest)
- Travis Oliphant (special guest)
- Fernando Perez (special guest)
- Anthony Scopatz (moderator)
Wes McKinney is a PhD student in Statistical Science at Duke University, focusing on Bayesian methods for time series and other dynamic processes. After his undergraduate studies, he worked for three years at AQR, a quantitative hedge fund, where he developed many research and production systems in Python. Part of his work at AQR was released as the open source project pandas, which he continues to actively develop. He is dedicated to building tools to enhance the use of Python for statistical computing applications, especially those relating to time series and financial applications. Outside of his academic work in statistics, he also does Python consulting work in the financial industry.
Travis Oliphant received his PhD in Biomedical Engineering from Mayo Graduate School and taught Electrical Engineering at Brigham Young University for 6 years before devoting himself full-time to developing scientific software and managing customer relationships at Enthought. He is one of the original authors of SciPy and a major NumPy contributor, and he enjoys reading about neuroscience.
Fernando Perez is a research scientist working on the development of algorithms and computational tools for neuroscience at the University of California, Berkeley. After a PhD in particle physics and a postdoc in applied mathematics developing numerical algorithms, he currently works at the interface between high-level scientific computing tools in Python and the mathematical questions that arise in the analysis of neuroimaging data. He started the IPython project in 2001 and continues to lead it, now as a collaborative effort with a talented team that does all the hard work. He regularly lectures about scientific computing in Python.
Intro/Outro Music: ‘The Fear’ - Lily Allen
- Data Array Summit Convore Thread
- datarray Project
- pandas: Python Data Analysis Library
- larray: Labeled Array Project
- NumPy Structured Arrays
- Fernando’s post about the summit
Our misguided host mistakenly declared Fernando as ‘Research Assistant’ when in fact he is a ‘Research Scientist.’ All apologies!