Python Data Science community

JupyterLab is Ready for Users – Jupyter Blog 2317 retweets

We are proud to announce the beta release series of JupyterLab, the next-generation web-based interface for Project Jupyter. Project Jupyter exists to develop open-source software, open standards…

Data science is different now · Vicki Boykis 930 retweets

New blog post: For the past couple years, I've been telling people who ask me for advice not to go into data science. Here's why: The data science job market is way oversaturated. Here's what they should do instead.

The Incredible Growth of Python | Stack Overflow 546 retweets

When we focus on high-income countries, the growth of Python is even larger than it might appear from tools like Stack Overflow Trends.

Code Golf in Python: Sudoku 534 retweets

def S(p):i=p.find('0');return[(s for v in set(str(5**18))-{(i-j)%9*(i/9^j/9)*(i/27^j/27|i%9/3^j%9/3)or p[j]for j in range(81)}for s in S(p[:i]+v+p[i+1:])),[p]][i<0] That solves any sudoku in 165 characters of Python. I always wished I could tweet...
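The one-liner above relies on Python 2 integer division, so it will not run unchanged on Python 3. Below is a hedged, ungolfed Python 3 sketch of what appears to be the same idea: find the first empty cell (marked `0`), collect the digits already used in its row, column, and 3x3 box, and recurse on each remaining candidate. The names `solve`, `puzzle`, and `used` are illustrative and not from the original post.

```python
def solve(puzzle):
    """Yield every solution of an 81-character sudoku string ('0' = empty)."""
    i = puzzle.find("0")
    if i < 0:
        # No empty cell left: the string itself is a solution.
        yield puzzle
        return

    row, col = divmod(i, 9)
    used = set()
    for j in range(81):
        r, c = divmod(j, 9)
        same_box = (r // 3 == row // 3) and (c // 3 == col // 3)
        if r == row or c == col or same_box:
            # Every cell that shares a row, column, or box constrains cell i.
            used.add(puzzle[j])

    # Try each digit that is still allowed in cell i and recurse.
    for digit in sorted(set("123456789") - used):
        yield from solve(puzzle[:i] + digit + puzzle[i + 1:])
```

Calling `next(solve(puzzle))` returns the first solution found; iterating the generator enumerates all of them, which can be slow for sparse puzzles because this is plain backtracking with no heuristics.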

xkcd: Python Environment 480 retweets

xkcd: Python Environment

functools — Higher-order functions and operations on callable objects ... 439 retweets

One of my favorite Python 3 builtins is functools.lru_cache(): with a simple decorator, repeated function calls become O(1) table lookups.
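A minimal sketch of the decorator in use, assuming a naive recursive Fibonacci as the expensive function (the `fib` example is an illustration, not taken from the linked docs). Note that `lru_cache` lives in the standard-library `functools` module and has to be imported.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # maxsize=None means the cache never evicts entries
def fib(n):
    """Naive recursive Fibonacci; the cache makes repeated calls cheap."""
    return n if n < 2 else fib(n - 1) + fib(n - 2)


# The first call fills the cache; later calls with the same argument are lookups.
fib(80)
print(fib.cache_info())  # reports hits, misses, maxsize, and current size
```

Arguments must be hashable for the cache to work, and an unbounded cache trades memory for speed, so a finite `maxsize` may be the safer choice for long-running programs.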

Alice in Python projectland · Vicki Boykis 370 retweets

I couldn't find a comprehensive guide for how to go from Python scripts to a packaged project, so I wrote one. 🐍

A fast, extensible progress bar for Python and CLI 341 retweets

Wanna see progress of a long running operation easily in your Jupyter notebook? Use the wonderful tqdm module - As a bonus, the name is Arabic & Spanish inspired!

Composable transformations of Python+NumPy programs: differentiate, ve 337 retweets

GitHub - google/jax: Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Altair: Declarative Visualization in Python — Altair 3.0.0 documentatio 309 retweets

Excited to announce the 2.0 release of Altair, built on over 800 new commits from 18 contributors. Huge thanks to everyone involved!

Seven Strategies for Optimizing Numerical Code 302 retweets

Talk given at PyCon 2018 Abstract: Python provides a powerful platform for working with data, but often the most straightforward data analysis can be painfully slow. When used effectively, though, Python can be as fast as even compiled languages ...

Teaching and Learning with Jupyter 297 retweets

The Handbook for Teaching and Learning with Jupyter—first draft is out! Jupyter4Edu

A foundation for scikit-learn at Inria -- Gaël Varoquaux: computer / d... 292 retweets

Announcing a foundation for scikit-learn at Inria: This legal structure will enable the team in France to receive private funding, which we will use to support the community, ensure the quality of the library, and tackle ambitious features pydata ...

Pretty dir() printing with joy 291 retweets

pdir() vs dir() in Python. I think I'm in love. Thanks brianokken for the heads up: Covered today on pythonbytes

Triple Pendulum CHAOS! | Pythonic Perambulations 288 retweets

New post: Triple Pendulum CHAOS! A Python reproduction of that triple pendulum animation going around.

Google Colaboratory 280 retweets

Check it out: Google has soft-launched their hosted Jupyter-like notebook!

Practical Data Science for Stats 260 retweets

Collection of preprints focusing on the practical side of data science workflows and statistical analysis. Curated by Jennifer Bryan and Hadley Wickham.

The Unexpected Effectiveness of Python in Science 239 retweets

PyCon 2017 opening keynote; see the video here:

Scipy Lecture Notes — Scipy lecture notes 233 retweets

Scipy lecture notes now available under A book to learn to master the Python numerical stack Secure and faster (benefits from CDN). Thanks to pdebuyl and westurner

Real numbers, data science and chaos: How to fit any dataset with a si... 228 retweets

✨👩‍💻📰 Someone give PapersWithCode every single prize. Displays every academic paper with code that has been open-sourced, ordered by Github stars accumulated in the last three days. ⭐️ There are even auto-populated tags for conferences and data...

Python Data Science Handbook - O'Reilly Media 219 retweets

For many researchers, Python is a first-class tool mainly because of its libraries for storing, manipulating, and gaining insight from data. Several resources exist for individual pieces of this data science stack, but only with the Python Data...

Python Data Science Handbook: full text in Jupyter Notebooks 207 retweets

GitHub - jakevdp/PythonDataScienceHandbook: Python Data Science Handbook: full text in Jupyter Notebooks

Installing Python Packages from a Jupyter Notebook 202 retweets

%pip and %conda magic commands have just been merged into IPython! 🎉🎉🎉 This will go a long way toward alleviating user confusion around package installation in ProjectJupyter that I wrote about last year:

Dask Version 1.0 198 retweets

Dask Version 1.0

Open in Colab - Chrome Web Store 197 retweets

Open a live, executable view of any ProjectJupyter notebook from GitHub in one click with the Open In GoogleColab Chrome extension:

Hierarchical Bayesian Neural Networks with Informative Priors 197 retweets

New post: Hierarchical Bayesian Neural Networks with Informative Priors deeplearning bayesian pydata pymc3

Weird matplotlib bug: points disappear unless I look directly at them.... 195 retweets

Weird matplotlib bug: points disappear unless I look directly at them.

Python Courses & Screencasts – Real Python 193 retweets

🐍🚀 After months of hard work and building out our brand new video team, I'm thrilled to announce that more than 120 📺 Python video lessons 🎥 have just landed on the Real Python website:

The problem with the data science language wars 182 retweets

I really enjoyed the cheeky blog post by my pal Rob Story. Like many other data tool creators, I've been annoyed by the assorted "Python vs R" click-bait articles and Hacker News posts by folks who in all likelihood might not survive an interview pan...

Visualizer for deep learning and machine learning models 180 retweets

🧠 Have you seen Netron? It's a viewer for neural network, deep learning, and machine learning models, supporting TFLite, TensorFlow, TensorFlow.js, MXNet, CoreML, PyTorch, scikit-learn, and more! Extremely cool project, LutzRoeder. 👍

A Diagram Editor for JupyterLab – Jupyter Blog 179 retweets

The new JupyterLab interface is much more than a replacement for the classic notebook. It aims to bring together all the pieces required for a complete scientific workflow. The extension-based…

Teach the tidyverse to beginners 179 retweets

A few years ago, I wrote a post Don’t teach built-in plotting to beginners (teach ggplot2). I argued that ggplot2 was not an advanced approach meant for experts, but rather a suitable introduction to data visualization.

Prioritize Which Data Skills Your Company Needs with This 2×2 Matrix 174 retweets

Everyone, it appears we have finally found it, the truly magical company where no data cleaning, statistics, or data warehousing need to ever occur.

Journal of Computational and Graphical Statistics: Vol 26, No 4 167 retweets

Donoho’s 50 Years of Data Science *and all the discussion pieces* are now open access = no more paywall! Thoughtful writing from many on the front lines re: teaching & doing Data Science inside Statistics.

Python's Visualization Landscape (PyCon 2017) 167 retweets

So you want to visualize some data in Python: which library do you choose? From Matplotlib to Seaborn to Bokeh to Plotly, Python has a range of mature tools to create beautiful visualizations, each with their own strengths and weaknesses. In this tal...

Installing Python Packages from a Jupyter Notebook 164 retweets

New post: Installing Python Packages from a Jupyter Notebook It should be easy, but it isn't... here's why
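One commonly suggested way to sidestep the confusion is to run pip through the same interpreter the notebook kernel is using, rather than whichever `pip` happens to be first on the shell path. The sketch below illustrates that idea with the standard library only; the helper names `pip_install_command` and `pip_install` are hypothetical, and this is not necessarily the exact approach the post recommends.

```python
import subprocess
import sys


def pip_install_command(*packages):
    """Build a pip command tied to the interpreter running this code."""
    # sys.executable is the Python binary behind the current kernel or script.
    return [sys.executable, "-m", "pip", "install", *packages]


def pip_install(*packages):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(*packages))
```

For example, `pip_install("numpy")` would install into the environment of the running interpreter. The block only defines the helpers, so nothing is installed until `pip_install` is called.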

Simulating Chutes & Ladders in Python 164 retweets

Simulating Chutes & Ladders in Python | Pythonic Perambulations

Welcome to PyAutoGUI’s documentation! — PyAutoGUI 1.0.0 documentation 163 retweets

Just learned about the pyautogui library: automate mouse clicks & other interactions with Python!

Probabilistic Programming in Python: Bayesian Modeling and Probabilist... 162 retweets

Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano - pymc-devs/pymc3

Visualize Python code execution (line-by-line) in Jupyter Notebook ce... 162 retweets

see what happens when you run a ProjectJupyter notebook cell line by line. This looks very cool!

Interactive Workflows for C++ with Jupyter 158 retweets

Scientists, educators and engineers not only use programming languages to build software systems, but also in interactive workflows, using the tools available to explore a problem and reason about…

Run C++ in the Jupyter notebook now! With the cling interpreter and 152 retweets

Run C++ in the Jupyter notebook now! With the cling interpreter and mybinder: thefreemanlab fperez_org minrk

5.1. Pipelines and composite estimators — scikit-learn 0.22.dev0 docume 151 retweets

Big deal for heterogeneous data in scikit-learn (columnar data, pandas DataFrame, CSV files): ColumnTransformer just merged in development version: Thanks to jorisvdbossche, amuellerml, Joel Nothman!

Hands-on Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2n... 150 retweets

Through a series of recent breakthroughs, deep learning has boosted the entire field of machine learning. Now, even programmers who know close to nothing about this technology can use simple ... - Selection from Hands-on Machine Learning with Scikit...

Exploring Line Lengths in Python Packages 147 retweets

New post: some data-driven ruminations on PEP8's 79 character limit & the mark it leaves on Python packages

A series of Jupyter notebooks that walk you through the fundamentals ... 145 retweets

Just pushed a notebook that covers loading and preprocessing data efficiently using TensorFlow 2. It includes code examples for the data API, TFRecords & the Features API (+ short examples of TF Transform, TF datasets and TF Hub). Enjoy! 👉 Noteb...

A high-level app and dashboarding solution for Python — Panel 0.5.1 doc 143 retweets

Fantastic project integrating all Python visualization tools into easily composable dashboards. This is awesome work from the PyViz team anacondainc.

Theano, TensorFlow and the Future of PyMC 143 retweets

PyMC3 is an open-source library for Bayesian statistical modeling and inference in Python, implementing gradient-based Markov chain Monte Carlo, variational inference, and other approximation…

GitHub + Jupyter Notebooks = <3 141 retweets

Communicating ideas that combine code, data and visualizations can be hard, especially if you’re trying to collaborate in realtime with your colleagues. Whether you’re a researcher studying Wikipedia, an astronomer investigating the movements of gala...

The Big Data Brain Drain: Why Science is in Trouble 138 retweets

The Big Data Brain Drain: Why Science is in Trouble | Pythonic Perambulations

Notebooks and code for the book "Introduction to Machine Learning wit... 137 retweets

GitHub - amueller/introduction_to_ml_with_python: Notebooks and code for the book "Introduction to Machine Learning with Python"

UMAP: Uniform Manifold Approximation and Projection for Dimension Redu... 135 retweets

Having fun this morning exploring the umap Python package – fast & scalable dimensionality reduction for visualization of high-dimensional datasets (with a scikit-learn style API)

Data Science and Linear Algebra Fundamentals with Python, SciPy, & Num... 133 retweets

Learn to perform data science and linear algebra fundamentals using Python, Scipy, & NumPy.

Statistics for Hackers - Speaker Deck 132 retweets

(Presented at PyCon 2016. Early version presented at StitchFix, Sept 2015. See the PyCon video at ...) The field of statistics has a reputation for being difficult to crack: it revolves around a seemingly en...

Introducing makeitpop, a tool to perceptually warp your data! 123 retweets

Note: It should go without saying, but you should never do the stuff that you’re about to read about here. Data is meant to speak for itself, and our visualizations should accurately reflect the data above all else.

Conda: Myths and Misconceptions 120 retweets

A new (long-ish) blog post — Conda: Myths and Misconceptions

Python's Data Science Stack (JSM 2016) 119 retweets

The Python language was not originally designed with scientific computing in mind, but its beauty and ease-of-use have inspired the development of a powerful and mature ecosystem of scientific and data-focused computing tools. This talk will give a b...

Finding a prime number whose binary representation is a giraffe (or a ... 116 retweets

Finding a prime number whose binary representation is a T-Rex... Python script included!

Beyond Interactive: Notebook Innovation at Netflix 115 retweets

At Netflix, we're reimagining what a Jupyter notebook can be, who can use it, and what they can do with it. And we're investing big to make this vision reality.

GPU Dask Arrays, first steps throwing Dask and CuPy together 114 retweets

Parallel GPU Arrays with Dask and CuPy. A blogpost on first steps.

Introduction · Computational and Inferential Thinking 112 retweets

Foundations of Data Science is the fastest-growing class in UC Berkeley history. Here's the book JupyterCon fperez_org

Jupyter/IPython Kernel Tools 109 retweets

Your favorite language not supported in Jupyter Notebooks yet? You can write a kernel to support it more easily than ever, with the Metakernel library - There's already a lot of kernels listed there - go make yours!

Jupyter notebooks as Markdown documents, Julia, Python or R scripts 108 retweets

Wanna use Jupyter Notebooks but store them as plain .py, .r or .jl files? Check out the amazing

Python for Data Analysis, 2nd Edition 103 retweets

Get complete instructions for manipulating, processing, cleaning, and crunching datasets in Python. Updated for Python 3.6, the second edition of this hands-on guide is packed with practical case stu...

datas-frame – Modern Pandas (Part 8): Scaling 103 retweets

Part 8: Scaling pandas with Dask: A belated update to my series on using pandas. This post covers scaling pandas to larger datasets using dask_dev

Similarity encoding for learning with dirty categorical variables 101 retweets

For statistical learning, categorical variables in a table are usually considered as discrete entities and encoded separately to feature vectors, e.g., with one-hot encoding. "Dirty" non-curated data gives rise to categorical variables with a very hi...

Beyond Numpy Arrays in Python Preparing the ecosystem for GPU, distrib... 101 retweets

New Blogpost: Beyond Numpy arrays in Python Preparing the ecosystem for sparse, distributed, and GPU Numpy-style arrays.

Building a desktop notification tool for Linux using python 101 retweets

Building a desktop notification tool for Linux using python | Codementor

A Dramatic Tour through Python’s Data Visualization Landscape (includi... 99 retweets

Why Even Try, Man? I recently came upon Brian Granger and Jake VanderPlas's Altair, a promising young visualization library. Altair seems well-suited to addressing Python's ggplot envy, and its tie-in with JavaScript's Vega-Lite grammar means that as...

Notebooks for the Altair tutorial 99 retweets

Notebooks for the Altair tutorial. Contribute to altair-viz/altair-tutorial development by creating an account on GitHub.