Pandas Transform

One of the defining characteristics of pandas is its vast library of methods for manipulating data. However, it is not always clear which of the many functions to use, or how to use them. If you are coming from Excel, it can be hard to translate solutions you already know into unfamiliar pandas commands. One of these “unknown” functions is the transform method. Even after using pandas for a while, I had never had the chance to use this function, so I recently took some time to figure out what it is and how it can be useful for real-world analysis. This article walks through an example in which it can be used to summarize data efficiently.
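As a rough illustration of the kind of summarizing the transform method allows, here is a minimal sketch. The column names (`order`, `amount`) and the values are made up for this example and are not taken from the original article. It assumes the common pattern of combining `groupby` with `transform` so that a per-group total is broadcast back onto every row.

```python
import pandas as pd

# Hypothetical order lines: two orders with several line items each.
df = pd.DataFrame({
    "order": [1, 1, 2, 2, 2],
    "amount": [10.0, 30.0, 5.0, 5.0, 10.0],
})

# Unlike a plain aggregation, transform returns one value per original row,
# so the group total can be assigned straight back as a new column.
df["order_total"] = df.groupby("order")["amount"].transform("sum")

# Each line item's share of its own order.
df["share"] = df["amount"] / df["order_total"]

print(df)
```

Because the result keeps the original row order and length, no separate merge step is needed to attach the group totals to the line items.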

This Month at DataCamp – February 2017

Source: DataCamp mail

This Month at DataCamp

An update from your favorite online data science education company 

New Statistics Curriculum!

Our entirely new curriculum takes a modern approach to teaching statistics using simulation and randomization rather than the traditional theoretical approach. Courses include Introduction to Data, Exploratory Data Analysis, Correlation and Regression, and Foundations of Inference.

New Python Courses

Data Visualization in Python

This course builds data visualization skills including customizing graphics, plotting two-dimensional arrays, creating statistical graphics, and more.

pandas Foundations

Learn to use the industry-standard pandas library to import, build, and manipulate DataFrames in Python.

Manipulating DataFrames with pandas

Learn to extract, filter, and transform data in order to drill into the data that really matters.

New R Courses

Exploratory Data Analysis in R: Case Study

Once you’re familiar with tools for data manipulation and visualization, this course gives you a chance to put them in action.

Object-Oriented Programming in R: S3 & R6

Object-oriented programming (OOP) lets you specify relationships between functions and objects and helps manage complexity in your code.

Introduction to Time Series Analysis

In this course, you will be introduced to core time series analysis concepts and techniques.

Bond Valuation & Analysis in R

In this course, you will learn how (and why) to use R to analyze and value bonds.

ARIMA Modeling with R

In this course, you will become an expert at fitting ARIMA models to time series data.

Python + Excel

New Tutorial: Using Python and Excel for Data Science

Excel is so popular, it is inevitable that you’ll need to deal with Excel files at some point. While Excel is accessible, you may also want to leverage the power and flexibility of Python. In this tutorial, learn some of the ways to read, write and manipulate data in Excel spreadsheets using Python.



Introduction to Scientific Python with pandas

Source: Excerpted from the original


An Introduction to Scientific Python – Pandas

Pandas has got to be one of my favourite libraries… ever.
Pandas allows us to deal with data in a way that we humans can understand: with labelled columns and indexes. It lets us effortlessly import data from files such as CSVs, quickly apply complex transformations and filters to our data, and much more. It’s absolutely brilliant.

Along with NumPy and Matplotlib, I feel it helps create a really strong base for data exploration and analysis in Python. SciPy (which will be covered in the next post) is of course a major component and another absolutely fantastic library, but I feel these three are the real pillars of scientific Python.

So without further ado, let’s get on with the third post in this series on scientific Python and take a look at Pandas. Don’t forget to check out the other posts if you haven’t yet!


The first thing to do is to import the star of the show, Pandas.
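A minimal sketch of that import, assuming the conventional `pd` alias that the text goes on to describe:

```python
# Import pandas under the short alias 'pd'.
import pandas as pd

# The alias refers to the same module, so every call is written as pd.<name>.
print(pd.__name__)
```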

This is the standard way to import Pandas. We don’t want to be writing ‘pandas’ all the time, but we also need to avoid naming clashes, so we compromise with ‘pd’. If you look at other people’s code that uses Pandas, you will see this import.


Pandas is based around two data types: the Series and the DataFrame.
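To make the two types concrete, here is a small sketch. The labels and values are invented for illustration and do not come from the original post.

```python
import pandas as pd

# A Series is a one-dimensional array of values with a label for each entry.
s = pd.Series([10, 20, 30], index=["a", "b", "c"])

# A DataFrame is a two-dimensional table; each column is itself a Series,
# and all columns share the same row index.
df = pd.DataFrame(
    {"x": [1, 2, 3], "y": [4.0, 5.0, 6.0]},
    index=["a", "b", "c"],
)

print(s["b"])                  # look up a value by its label
print(type(df["x"]).__name__)  # selecting one column gives back a Series
print(df["x"].sum())           # column-wise operations work on that Series
```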