Learning to Work with Series in Python Using the pandas Library

Python Pandas Series are one-dimensional labeled arrays that can hold data of any type. They are similar to a column in a spreadsheet or a SQL table.

To create a Pandas Series, you can pass a list, a NumPy array, or a dictionary to the pd.Series() constructor.

import pandas as pd

# Creating a Series from a list
data = ['apple', 'banana', 'orange']
series = pd.Series(data)
print(series)

The output will be:

0     apple
1    banana
2    orange
dtype: object

In this example, the list ['apple', 'banana', 'orange'] is passed to pd.Series(), which creates a Series object with the default integer index 0, 1, and 2. The dtype is object because the values are Python strings.

You can also specify custom indexes for the Series:

import pandas as pd

# Creating a Series with custom indexes
data = ['apple', 'banana', 'orange']
index = ['a', 'b', 'c']
series = pd.Series(data, index=index)
print(series)

The output will be:

a     apple
b    banana
c    orange
dtype: object

This time, the Series object is created with custom indexes 'a', 'b', and 'c'.

You can also access elements of a Series using the indexes:

import pandas as pd

data = ['apple', 'banana', 'orange']
series = pd.Series(data)

print(series[0])  # Output: apple
print(series[1])  # Output: banana
print(series[2])  # Output: orange

The output will be:

apple
banana
orange

In this example, we access the elements of the Series using the indexes 0, 1, and 2.
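When the index is not the default 0, 1, 2, it helps to say explicitly whether a lookup is by label or by position. A minimal sketch, assuming the same fruit data with a custom index (the variable names are illustrative), uses the .loc accessor for labels and the .iloc accessor for integer positions:

```python
import pandas as pd

series = pd.Series(['apple', 'banana', 'orange'], index=['a', 'b', 'c'])

first_by_label = series.loc['a']      # look up by index label
first_by_position = series.iloc[0]    # look up by integer position
last_by_position = series.iloc[-1]    # negative positions count from the end

print(first_by_label)
print(first_by_position)
print(last_by_position)
```

Both accessors return the stored value itself when given a single label or position, so nothing but the value is printed.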

That's a quick overview of Python Pandas Series! They are powerful data structures for analyzing and manipulating data.

Detailed Answer

Python Pandas Series

What is a Series in Python Pandas?

A Series in Python Pandas is a one-dimensional labeled array capable of storing data of any type (integer, string, float, etc.). It is similar to a column in a spreadsheet or a SQL table. Each element in a series is associated with a label, and the labels together form the index. Labels are usually unique, although pandas does not require it. The index makes it possible to select and align data by label rather than by position.

Creating a Series

To create a series, we can pass a list, an array, or a dictionary to the pandas.Series() constructor.

import numpy as np
import pandas as pd

# Create a series from a list
my_list = [10, 20, 30, 40, 50]
series_from_list = pd.Series(my_list)
print(series_from_list)

# Create a series from a NumPy array
my_array = np.array([1, 2, 3, 4, 5])
series_from_array = pd.Series(my_array)
print(series_from_array)

# Create a series from a dictionary
my_dict = {'a': 1, 'b': 2, 'c': 3}
series_from_dict = pd.Series(my_dict)
print(series_from_dict)

Output:

0    10
1    20
2    30
3    40
4    50
dtype: int64

0    1
1    2
2    3
3    4
4    5
dtype: int64

a    1
b    2
c    3
dtype: int64
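When a dictionary is passed, its keys become the index labels. As a small sketch, assuming we also pass an explicit index (the name selected and the label 'd' are illustrative), pandas keeps only the requested labels and fills labels that are missing from the dictionary with NaN:

```python
import pandas as pd

my_dict = {'a': 1, 'b': 2, 'c': 3}

# Only the labels listed in index are kept; 'd' is not in the dict, so it becomes NaN
selected = pd.Series(my_dict, index=['a', 'c', 'd'])
print(selected)
```

Because NaN is a floating-point value, the resulting series has dtype float64 rather than int64.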

Working with a Series

Once we have created a series, we can perform various operations and manipulations on it.

Accessing Elements

To access elements in a series, we can use [] notation with the index label. We can also use slice notation to retrieve a range of elements. Unlike ordinary Python slices, a slice by label includes both endpoints.

import pandas as pd

my_series = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Accessing a single element
print(my_series['a'])

# Accessing multiple elements using slice notation
print(my_series['b':'d'])

Output:

1

b    2
c    3
d    4
dtype: int64
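The difference between slicing by label and slicing by position is easy to miss. The following sketch, which reuses the series above (the names by_label and by_position are illustrative), compares the two:

```python
import pandas as pd

my_series = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

by_label = my_series.loc['b':'d']    # labels b, c, d: the end label is included
by_position = my_series.iloc[1:3]    # positions 1 and 2: the end position is excluded

print(by_label)
print(by_position)
```

Here by_label holds three elements (b, c, d), while by_position holds two (b, c).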

Operations on Series

We can perform various mathematical operations on a series, such as addition, subtraction, multiplication, and division. These operations are performed element-wise. We can also apply mathematical functions to a series.

import pandas as pd

series1 = pd.Series([1, 2, 3, 4, 5])
series2 = pd.Series([10, 20, 30, 40, 50])

# Addition
sum_series = series1 + series2
print(sum_series)

# Subtraction
diff_series = series1 - series2
print(diff_series)

# Multiplication
prod_series = series1 * series2
print(prod_series)

# Division
quot_series = series1 / series2
print(quot_series)

# Applying a mathematical function
squared_series = series1.apply(lambda x: x**2)
print(squared_series)

Output:

0    11
1    22
2    33
3    44
4    55
dtype: int64

0    -9
1   -18
2   -27
3   -36
4   -45
dtype: int64

0     10
1     40
2     90
3    160
4    250
dtype: int64

0    0.1
1    0.1
2    0.1
3    0.1
4    0.1
dtype: float64

0     1
1     4
2     9
3    16
4    25
dtype: int64
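Two details are worth a short sketch. First, simple element-wise math usually does not need apply(): operators and NumPy functions work on the whole series at once. Second, arithmetic between two series matches elements by index label, not by position. The series left and right below are illustrative and not part of the examples above:

```python
import numpy as np
import pandas as pd

series1 = pd.Series([1, 2, 3, 4, 5])

squared = series1 ** 2      # same values as series1.apply(lambda x: x**2)
roots = np.sqrt(series1)    # NumPy functions accept a Series and return a Series

left = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
right = pd.Series([10, 20, 30], index=['b', 'c', 'd'])

# Labels present in only one operand ('a' and 'd') produce NaN in the result
aligned_sum = left + right

print(squared)
print(roots)
print(aligned_sum)
```

In aligned_sum, only the labels b and c have values (12 and 23), because they are the only labels shared by both operands.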

Filtering

We can filter a series based on certain conditions using boolean indexing.

import pandas as pd

my_series = pd.Series([10, 20, 30, 40, 50])

# Filtering elements greater than 30
filtered_series = my_series[my_series > 30]
print(filtered_series)

Output:

3    40
4    50
dtype: int64
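Conditions can also be combined. A minimal sketch, assuming the same data (the result names are illustrative), uses & for "and", | for "or", and ~ for "not". Each condition needs its own parentheses because these operators bind more tightly than comparisons:

```python
import pandas as pd

my_series = pd.Series([10, 20, 30, 40, 50])

between = my_series[(my_series > 10) & (my_series < 50)]      # 20, 30, 40
extremes = my_series[(my_series == 10) | (my_series == 50)]   # 10, 50
not_large = my_series[~(my_series > 30)]                      # 10, 20, 30

print(between)
print(extremes)
print(not_large)
```

The filtered series keep their original index labels, so between still carries the labels 1, 2, and 3.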

Missing Data

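In a numeric series, missing data is represented by NaN (Not a Number). NaN is a floating-point value, so a series of integers that contains missing values is stored as float64. This is why the values below are printed as 10.0, 20.0, and 40.0.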

import numpy as np
import pandas as pd

my_series = pd.Series([10, 20, np.nan, 40, np.nan])

# Checking for missing data
print(my_series.isnull())

# Dropping missing data
my_series_without_nan = my_series.dropna()
print(my_series_without_nan)

Output:

0    False
1    False
2     True
3    False
4     True
dtype: bool

0    10.0
1    20.0
3    40.0
dtype: float64
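Dropping is not the only option. As a hedged sketch of one common alternative, assuming that a replacement value makes sense for the data at hand, fillna() substitutes a value for each NaN (the result names are illustrative):

```python
import numpy as np
import pandas as pd

my_series = pd.Series([10, 20, np.nan, 40, np.nan])

missing_count = my_series.isna().sum()             # isna() is the same check as isnull()
filled_zero = my_series.fillna(0)                  # replace each NaN with a constant
filled_mean = my_series.fillna(my_series.mean())   # mean() skips NaN by default

print(missing_count)
print(filled_zero)
print(filled_mean)
```

Whether to drop or fill depends on the analysis: filling keeps the series at its original length but introduces values that were never observed.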

Conclusion

Python Pandas Series is a powerful data structure that allows us to store and manipulate data efficiently. In this article, we covered the basics of creating a series, accessing elements, performing operations, filtering, and handling missing data. By mastering the concepts and techniques discussed here, you will be able to effectively use series in your data analysis and data manipulation tasks.
