Note: This notebook was created for placement preparation. It draws on multiple resources from the internet. Its purpose is to help me on my journey, and it is for educational purposes only.¶
NumPy is a Python library.
NumPy is used for working with arrays.
NumPy is short for "Numerical Python".
It is a library consisting of multidimensional array objects and a collection of routines for processing arrays.
import numpy as np
NumPy Introduction¶
What is NumPy?¶
NumPy is a Python library used for working with arrays.
It also has functions for working in the domains of linear algebra, Fourier transforms, and matrices.
NumPy was created in 2005 by Travis Oliphant. It is an open source project and you can use it freely.
NumPy stands for Numerical Python.
Why Use NumPy?¶
In Python we have lists that serve the purpose of arrays, but they are slow to process.
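NumPy aims to provide an array object that is much faster than traditional Python lists for numerical work. Figures of up to 50x are often quoted, but the actual speedup depends on the operation and the array size.
The array object in NumPy is called ndarray; it provides a lot of supporting functions that make working with ndarray very easy.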
Arrays are very frequently used in data science, where speed and resources are very important.
NumPy is a Python library and is written partially in Python, but most of the parts that require fast computation are written in C or C++.
Why is NumPy Faster Than Lists?¶
NumPy arrays are stored in one contiguous block of memory, unlike lists, so their elements can be accessed and manipulated very efficiently.
This property is known as locality of reference in computer science.
This is the main reason why NumPy is faster than lists. NumPy is also optimized to work with modern CPU architectures.
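A rough way to see the difference is to time the same element-wise operation on a list and on an ndarray. The sketch below is only illustrative: the array size `n` and the repeat count are arbitrary choices, and the measured times depend on the machine, so no particular speedup is claimed.

```python
import timeit
import numpy as np

n = 100_000  # arbitrary size chosen for this illustration
py_list = list(range(n))
np_arr = np.arange(n)

# The same computation two ways: double every element.
doubled_list = [v * 2 for v in py_list]
doubled_arr = np_arr * 2

# Time each version; absolute numbers vary from machine to machine.
t_list = timeit.timeit(lambda: [v * 2 for v in py_list], number=20)
t_arr = timeit.timeit(lambda: np_arr * 2, number=20)
print(f"list: {t_list:.4f}s  ndarray: {t_arr:.4f}s")

# The array data sits in one contiguous block of fixed-size items.
print(np_arr.flags['C_CONTIGUOUS'], np_arr.itemsize)
```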
Operations using NumPy¶
Using NumPy, a developer can perform the following operations:
- Mathematical and logical operations on arrays.
- Fourier transforms and routines for shape manipulation.
- Operations related to linear algebra. NumPy has built-in functions for linear algebra and random number generation.
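As a minimal sketch of these operation groups, the snippet below touches each one briefly. The input values are made up for illustration, and `np.random.default_rng` is just one of the ways NumPy offers to generate random numbers.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# Mathematical and logical operations work element-wise.
shifted = a + 10
mask = (a > 2) & (a < 4)

# Shape manipulation and a Fourier transform.
m = a.reshape(2, 2)
spectrum = np.fft.fft(np.array([1.0, 0.0, 0.0, 0.0]))

# Linear algebra: solve A @ x = b for x.
A = np.array([[2.0, 0.0], [0.0, 4.0]])
b = np.array([2.0, 8.0])
x = np.linalg.solve(A, b)

# Random numbers from a seeded generator.
rng = np.random.default_rng(0)
sample = rng.random(3)

print(shifted, mask, m.shape, x)
```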
NumPy Getting Started¶
Installation of NumPy¶
pip install numpy
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (1.22.4)
Import NumPy¶
import numpy
arr = numpy.array([1, 2, 3, 4, 5])
print(arr)
[1 2 3 4 5]
NumPy as np¶
NumPy is usually imported under the np alias.
alias: In Python, an alias is an alternate name for referring to the same thing.
import numpy as np
Checking NumPy Version¶
- The version string is stored in the __version__ attribute.
print(np.__version__)
1.22.4
NumPy Creating Arrays¶
Create a NumPy ndarray Object¶
NumPy is used to work with arrays. The array object in NumPy is called ndarray. (N-dimensional array).
We can create a NumPy ndarray object by using the array() function.
type(): This built-in Python function tells us the type of the object passed to it. In the code below, it shows that arr is of type numpy.ndarray.
An ndarray describes a collection of items of the same type.
Every item in an ndarray takes up a block of memory of the same size. The layout of each element is described by a data-type object (called dtype).
The array() function creates an ndarray from any object exposing the array interface, or from any object whose __array__ method returns an array.
numpy.array(object, dtype = None, copy = True, order = 'K', subok = False, ndmin = 0)
- The above constructor takes the following parameters:
| No. | Parameter | Description |
|---|---|---|
| 1 | object | Any object exposing the array interface, an object whose __array__ method returns an array, or any (nested) sequence |
| 2 | dtype | Desired data type of the array, optional |
| 3 | copy | Optional. By default (True), the object is copied |
| 4 | order | 'C' (row major), 'F' (column major), 'A' (any) or 'K' (keep the input layout; the default) |
| 5 | subok | By default, the returned array is forced to be a base-class array. If True, sub-classes are passed through |
| 6 | ndmin | Specifies the minimum number of dimensions of the resultant array |
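The parameters in the table can be tried out directly. This is a small sketch with made-up input data; it only exercises dtype, ndmin, order and the default copy behaviour, and `np.shares_memory` is used here as one way to check whether two arrays overlap in memory.

```python
import numpy as np

src = [[1, 2, 3], [4, 5, 6]]

# dtype: request 32-bit floats instead of the inferred integer type.
f = np.array(src, dtype=np.float32)

# ndmin: prepend axes of length 1 until the array has 3 dimensions.
padded = np.array(src, ndmin=3)

# order: 'F' asks for column-major (Fortran-style) memory layout.
col_major = np.array(src, order='F')

# copy: with the default copy=True, the result does not share memory
# with an existing array passed as input.
base = np.array(src)
dup = np.array(base)

print(f.dtype, padded.shape, col_major.flags['F_CONTIGUOUS'])
print(np.shares_memory(base, dup))
```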
arr = np.array([1, 2, 3, 4, 5])
print(arr)
print(type(arr))
[1 2 3 4 5] <class 'numpy.ndarray'>
- To create an ndarray, we can pass a list, tuple, or any array-like object to the array() function, and it will be converted into an ndarray.
arr = np.array((1, 2, 3, 4, 5))
print(arr)
[1 2 3 4 5]
Dimensions in Arrays¶
A dimension in arrays is one level of array depth (nested arrays).
nested arrays: arrays that have arrays as their elements.
# minimum dimensions
a = np.array([1, 2, 3,4,5], ndmin = 2)
print(a)
[[1 2 3 4 5]]
0-D Arrays¶
- 0-D arrays, or Scalars, are the elements in an array. Each value in an array is a 0-D array.
arr = np.array(42)
print(arr)
42
1-D Arrays¶
An array that has 0-D arrays as its elements is called uni-dimensional or 1-D array.
These are the most common and basic arrays.
arr = np.array([1, 2, 3, 4, 5])
print(arr)
[1 2 3 4 5]
2-D Arrays¶
An array that has 1-D arrays as its elements is called a 2-D array.
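These are often used to represent matrices or 2nd-order tensors.
NumPy has a whole submodule dedicated to linear algebra on such arrays, numpy.linalg. The older numpy.matrix class (which numpy.mat constructs in NumPy 1.x) still exists, but regular ndarrays are generally recommended instead.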
arr = np.array([[1, 2, 3], [4, 5, 6]])
print(arr)
[[1 2 3] [4 5 6]]
3-D arrays¶
An array that has 2-D arrays (matrices) as its elements is called 3-D array.
These are often used to represent a 3rd order tensor.
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(arr)
[[[1 2 3] [4 5 6]] [[1 2 3] [4 5 6]]]
Check Number of Dimensions?¶
- NumPy arrays provide the ndim attribute, an integer that tells us how many dimensions the array has.
a = np.array(42)
b = np.array([1, 2, 3, 4, 5])
c = np.array([[1, 2, 3], [4, 5, 6]])
d = np.array([[[1, 2, 3], [4, 5, 6]], [[1, 2, 3], [4, 5, 6]]])
print(a.ndim)
print(b.ndim)
print(c.ndim)
print(d.ndim)
0 1 2 3
Higher Dimensional Arrays¶
An array can have any number of dimensions.
When the array is created, you can define the number of dimensions by using the ndmin argument.
#Create an array with 5 dimensions and verify that it has 5 dimensions:
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('number of dimensions :', arr.ndim)
[[[[[1 2 3 4]]]]] number of dimensions : 5
- In this array the innermost dimension (5th dim) has 4 elements, the 4th dim has 1 element (the vector), the 3rd dim has 1 element (the matrix containing the vector), the 2nd dim has 1 element (a 3-D array), and the 1st dim has 1 element (a 4-D array).
NumPy Array Indexing¶
Access Array Elements¶
Array indexing is the same as accessing an array element.
You can access an array element by referring to its index number.
The indexes in NumPy arrays start with 0, meaning that the first element has index 0, and the second has index 1 etc.
arr = np.array([1, 2, 3, 4])
print(arr[0])
print(arr[1])
print(arr[2] + arr[3])
1 2 7
Access 2-D Arrays¶
To access elements from 2-D arrays we can use comma separated integers representing the dimension and the index of the element.
Think of 2-D arrays like a table with rows and columns, where the dimension represents the row and the index represents the column.
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('2nd element on 1st row: ', arr[0, 1])
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('5th element on 2nd row: ', arr[1, 4])
2nd element on 1st row: 2 5th element on 2nd row: 10
Access 3-D Arrays¶
- To access elements from 3-D arrays we can use comma separated integers representing the dimensions and the index of the element.
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
print(arr[0, 1, 2])
6
Negative Indexing¶
- Use negative indexing to access an array from the end.
arr = np.array([[1,2,3,4,5], [6,7,8,9,10]])
print('Last element from 2nd dim: ', arr[1, -1])
Last element from 2nd dim: 10
Advanced Indexing¶
Advanced indexing is triggered when the selection object is a non-tuple sequence, an ndarray of integer or Boolean data type, or a tuple with at least one item being a sequence object. Advanced indexing always returns a copy of the data, whereas basic slicing only returns a view.
There are two types of advanced indexing: integer and Boolean.
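The copy-versus-view distinction between advanced indexing and basic slicing can be checked directly. The following is a minimal sketch with a made-up array; `np.shares_memory` is used here as one way to tell whether two arrays refer to the same data.

```python
import numpy as np

x = np.arange(10)

sliced = x[2:5]        # basic slicing
picked = x[[2, 3, 4]]  # integer (advanced) indexing

print(np.shares_memory(x, sliced))  # the slice is a view of x
print(np.shares_memory(x, picked))  # the advanced index is a copy

sliced[0] = 99  # writes through to x
picked[1] = -1  # leaves x untouched
print(x)
```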
Integer Indexing¶
- This mechanism selects arbitrary items in an array based on their N-dimensional index. Each integer array supplies the indices along one dimension. When the index consists of as many integer arrays as the target ndarray has dimensions, the selection is straightforward: the arrays are paired element by element to form coordinates.
x = np.array([[1, 2], [3, 4], [5, 6]])
y = x[[0,1,2], [0,1,0]]
print(y)
#The selection includes elements at (0,0), (1,1) and (2,0) from the first array.
[1 4 5]
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
rows = np.array([[0,0],[3,3]])
cols = np.array([[0,2],[0,2]])
y = x[rows,cols]
print('The corner elements of this array are:')
print(y)
#The resultant selection is an ndarray object containing corner elements.
Our array is:
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

The corner elements of this array are:
[[ 0  2]
 [ 9 11]]
- Advanced and basic indexing can be combined by using one slice (:) or ellipsis (...) with an index array. The following example uses a slice for the rows and an advanced index for the columns. The values are the same as when a slice is used for both, but the advanced index produces a copy and may have a different memory layout.
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
# slicing
z = x[1:4,1:3]
print('After slicing, our array becomes:')
print(z)
print('\n')
# using advanced index for column
y = x[1:4,[1,2]]
print('Slicing using advanced index for column:')
print(y)
Our array is:
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

After slicing, our array becomes:
[[ 4  5]
 [ 7  8]
 [10 11]]

Slicing using advanced index for column:
[[ 4  5]
 [ 7  8]
 [10 11]]
Boolean Array Indexing¶
- This type of advanced indexing is used when the resultant object is meant to be the result of Boolean operations, such as comparison operators.
x = np.array([[ 0, 1, 2],[ 3, 4, 5],[ 6, 7, 8],[ 9, 10, 11]])
print('Our array is:')
print(x)
print('\n')
# Now we will print the items greater than 5
print('The items greater than 5 are:')
print(x[x > 5])
Our array is:
[[ 0  1  2]
 [ 3  4  5]
 [ 6  7  8]
 [ 9 10 11]]

The items greater than 5 are:
[ 6  7  8  9 10 11]
# In this example, NaN (Not a Number) elements are omitted by using ~ (complement operator).
a = np.array([np.nan, 1,2,np.nan,3,4,5])
print(a[~np.isnan(a)])
[1. 2. 3. 4. 5.]
#The following example shows how to filter out the non-complex elements from an array.
a = np.array([1, 2+6j, 5, 3.5+5j])
print(a[np.iscomplex(a)])
[2. +6.j 3.5+5.j]
NumPy Array Slicing¶
Slicing arrays¶
Slicing in Python means taking elements from one given index to another given index.
We pass a slice instead of an index, like this: [start:end].
We can also define the step, like this: [start:end:step].
If we don't pass start, it is considered 0.
If we don't pass end, it is considered the length of the array in that dimension.
If we don't pass step, it is considered 1.
Note: The result includes the start index, but excludes the end index.
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5])
print(arr[4:])
print(arr[:4])
[2 3 4 5] [5 6 7] [1 2 3 4]
# array to begin with
import numpy as np
a = np.array([[1,2,3],[3,4,5],[4,5,6]])
print('Our array is:')
print(a)
print('\n')
# this returns array of items in the second column
print ('The items in the second column are:')
print (a[...,1])
print('\n')
# Now we will slice all items from the second row
print ('The items in the second row are:')
print (a[1,...])
print('\n')
# Now we will slice all items from column 1 onwards
print ('The items column 1 onwards are:')
print (a[...,1:])
Our array is:
[[1 2 3]
 [3 4 5]
 [4 5 6]]

The items in the second column are:
[2 4 5]

The items in the second row are:
[3 4 5]

The items column 1 onwards are:
[[2 3]
 [4 5]
 [5 6]]
Negative Slicing¶
- Use the minus operator to refer to an index from the end.
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[-3:-1])
[5 6]
STEP¶
- Use the step value to determine the step of the slicing
arr = np.array([1, 2, 3, 4, 5, 6, 7])
print(arr[1:5:2])
print(arr[::2])
[2 4] [1 3 5 7]
Slicing 2-D Arrays¶
arr = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]])
print(arr[1, 1:4])
print(arr[0:2, 2])
print(arr[0:2, 1:4])
[7 8 9] [3 8] [[2 3 4] [7 8 9]]
a = np.arange(10)
s = slice(2,7,2)
print(a[s])
[2 4 6]
NumPy Data Types¶
Data Types in Python¶
strings - used to represent text data, the text is given under quote marks. e.g. "ABCD"
integer - used to represent integer numbers. e.g. -1, -2, -3
float - used to represent real numbers. e.g. 1.2, 42.42
boolean - used to represent True or False.
complex - used to represent complex numbers. e.g. 1.0 + 2.0j, 1.5 + 2.5j
Data Types in NumPy¶
- NumPy supports a much greater variety of numerical types than Python does. The following table shows different scalar data types defined in NumPy.
| No. | Data Types | Description |
|---|---|---|
| 1 | bool_ | Boolean (True or False) stored as a byte |
| 2 | int_ | Default integer type (same as C long; normally either int64 or int32) |
| 3 | intc | Identical to C int (normally int32 or int64) |
| 4 | intp | Integer used for indexing (same as C ssize_t; normally either int32 or int64) |
| 5 | int8 | Byte (-128 to 127) |
| 6 | int16 | Integer (-32768 to 32767) |
| 7 | int32 | Integer (-2147483648 to 2147483647) |
| 8 | int64 | Integer (-9223372036854775808 to 9223372036854775807) |
| 9 | uint8 | Unsigned integer (0 to 255) |
| 10 | uint16 | Unsigned integer (0 to 65535) |
| 11 | uint32 | Unsigned integer (0 to 4294967295) |
| 12 | uint64 | Unsigned integer (0 to 18446744073709551615) |
| 13 | float_ | Shorthand for float64 (this alias was removed in NumPy 2.0; use float64) |
| 14 | float16 | Half precision float: sign bit, 5 bits exponent, 10 bits mantissa |
| 15 | float32 | Single precision float: sign bit, 8 bits exponent, 23 bits mantissa |
| 16 | float64 | Double precision float: sign bit, 11 bits exponent, 52 bits mantissa |
| 17 | complex_ | Shorthand for complex128 (this alias was removed in NumPy 2.0; use complex128) |
| 18 | complex64 | Complex number, represented by two 32-bit floats (real and imaginary components) |
| 19 | complex128 | Complex number, represented by two 64-bit floats (real and imaginary components) |
- NumPy numerical types are instances of dtype (data-type) objects, each having unique characteristics. The dtypes are available as np.bool_, np.float32, etc.
NumPy has some extra data types, and it can refer to data types by a single character, such as i for integers and u for unsigned integers.
Below is a list of all data types in NumPy and the characters used to represent them.
i - integer
b - boolean
u - unsigned integer
f - float
c - complex float
m - timedelta
M - datetime
O - object
S - byte string
U - unicode string
V - fixed chunk of memory for other type (void)
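These letters correspond to the `kind` attribute of a dtype object. The sketch below builds one dtype per letter from type strings chosen purely for illustration and prints its kind. One caveat worth hedging: when used as a type string, 'b' means a signed byte (int8), while the Boolean type string is '?'.

```python
import numpy as np

# One illustrative type string per kind letter listed above.
examples = ['i4', 'u1', 'f8', 'c16', 'm8[s]', 'M8[D]', 'O', 'S5', 'U5', 'V4']

kinds = []
for code in examples:
    dt = np.dtype(code)
    kinds.append(dt.kind)
    print(f"{code:>6} -> kind={dt.kind!r}, itemsize={dt.itemsize}, name={dt.name}")

# Boolean has kind 'b', but its type string is '?'; the type string 'b' is int8.
print(np.dtype('?').kind, np.dtype('b'))
```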
Checking the Data Type of an Array¶
Data Type Objects (dtype)
The NumPy array object has a property called dtype that returns the data type of the array.
A data type object describes how the fixed block of memory corresponding to an array is interpreted, depending on the following aspects:
- Type of data (integer, float or Python object)
- Size of data
- Byte order (little-endian or big-endian)
- In case of structured type, the names of fields, data type of each field and part of the memory block taken by each field.
- If data type is a subarray, its shape and data type
The byte order is decided by prefixing '<' or '>' to the data type. '<' means that the encoding is little-endian (the least significant byte is stored at the smallest address). '>' means that the encoding is big-endian (the most significant byte is stored at the smallest address).
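The effect of the byte-order prefix can be seen by looking at the raw bytes of the same value stored both ways. This is a minimal sketch using a single made-up value; `tobytes()` returns the underlying memory as a bytes object.

```python
import numpy as np

little = np.array([1], dtype='<i4')  # little-endian 4-byte integer
big = np.array([1], dtype='>i4')     # big-endian 4-byte integer

print(little.tobytes())  # least significant byte comes first
print(big.tobytes())     # most significant byte comes first

# The stored value is the same; only the memory layout differs.
print(little[0] == big[0])
```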
A dtype object is constructed using the following syntax:
numpy.dtype(object, align, copy)
- The parameters are:
- object: the object to be converted to a data type object
- align: if True, adds padding to the fields so that the layout is similar to a C struct
- copy: if True, makes a new copy of the dtype object; if False, the result is a reference to the built-in data type object
arr = np.array([1, 2, 3, 4])
print(arr.dtype)
arr = np.array(['apple', 'banana', 'cherry'])
print(arr.dtype)
int64 <U6
import numpy as np
dt = np.dtype(np.int32)
print(dt)
int32
Creating Arrays With a Defined Data Type¶
- We use the array() function to create arrays. It accepts an optional dtype argument that defines the expected data type of the array elements.
arr = np.array([1, 2, 3, 4], dtype='S')
print(arr)
print(arr.dtype)
[b'1' b'2' b'3' b'4'] |S1
# For i, u, f, S and U we can define size as well.
# Create an array with data type 4 bytes integer
arr = np.array([1, 2, 3, 4], dtype='i4')
print(arr)
print(arr.dtype)
[1 2 3 4] int32
a = np.array([1, 2, 3], dtype = complex)
print(a)
[1.+0.j 2.+0.j 3.+0.j]
#int8, int16, int32, int64 can be replaced by equivalent string 'i1', 'i2','i4', etc.
dt = np.dtype('i4')
print(dt)
int32
# using endian notation
dt = np.dtype('>i4')
print(dt)
>i4
# first create structured data type
dt = np.dtype([('age',np.int8)])
print(dt)
[('age', 'i1')]
# now apply it to ndarray object
dt = np.dtype([('age',np.int8)])
a = np.array([(10,),(20,),(30,)], dtype = dt)
print(a)
[(10,) (20,) (30,)]
# the field name can be used to access the content of the age column
dt = np.dtype([('age',np.int8)])
a = np.array([(10,),(20,),(30,)], dtype = dt)
print(a['age'])
[10 20 30]
The following examples define a structured data type called student with a string field 'name', an integer field 'age' and a float field 'marks'. This dtype is then applied to an ndarray object.¶
student = np.dtype([('name','S20'), ('age', 'i1'), ('marks', 'f4')])
print(student)
[('name', 'S20'), ('age', 'i1'), ('marks', '<f4')]
student = np.dtype([('name','S20'), ('age', 'i1'), ('marks', 'f4')])
a = np.array([('abc', 21, 50),('xyz', 18, 75)], dtype = student)
print(a)
[(b'abc', 21, 50.) (b'xyz', 18, 75.)]
What if a Value Can Not Be Converted?¶
If a type is given to which the elements cannot be cast, NumPy will raise a ValueError.
ValueError: In Python, ValueError is raised when an argument passed to a function has the right type but an inappropriate value.
# A non integer string like 'a' can not be converted to integer (will raise an error):
#arr = np.array(['a', '2', '3'], dtype='i')
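To see the error without stopping the notebook, the conversion can be wrapped in try/except. This is a small sketch: the first call fails on 'a', while the second, with strings that all look like integers, succeeds.

```python
import numpy as np

try:
    arr = np.array(['a', '2', '3'], dtype='i')
except ValueError as exc:
    # 'a' cannot be interpreted as an integer.
    arr = None
    print('conversion failed:', exc)

# Strings that represent integers convert without error.
ok = np.array(['1', '2', '3'], dtype='i')
print(ok, ok.dtype)
```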
Converting Data Type on Existing Arrays¶
The best way to change the data type of an existing array is to make a copy of the array with the astype() method.
The astype() method creates a copy of the array, and allows you to specify the data type as a parameter.
The data type can be specified using a string, like 'f' for float, 'i' for integer etc. or you can use the data type directly like float for float and int for integer.
#Change data type from float to integer by using 'i' as parameter value:
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype('i')
print(newarr)
print(newarr.dtype)
[1 2 3] int32
#Change data type from float to integer by using int as parameter value:
arr = np.array([1.1, 2.1, 3.1])
newarr = arr.astype(int)
print(newarr)
print(newarr.dtype)
[1 2 3] int64
#Change data type from integer to boolean:
arr = np.array([1, 0, 3])
newarr = arr.astype(bool)
print(newarr)
print(newarr.dtype)
[ True False True] bool
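One detail worth noting about the float-to-integer examples above: astype() drops the fractional part rather than rounding. The sketch below uses made-up values to show the difference; rounding first with `np.rint` is one possible approach if rounded integers are wanted.

```python
import numpy as np

arr = np.array([1.9, -1.9, 2.5])

truncated = arr.astype(int)         # fractional part is discarded
rounded = np.rint(arr).astype(int)  # round to nearest first, then convert

print(truncated)
print(rounded)
print(arr)  # the original array is unchanged, since astype() returns a copy
```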
NumPy Array Copy vs View¶
The Difference Between Copy and View¶
The main difference between a copy and a view of an array is that the copy is a new array, and the view is just a view of the original array.
The copy owns the data and any changes made to the copy will not affect original array, and any changes made to the original array will not affect the copy.
The view does not own the data and any changes made to the view will affect the original array, and any changes made to the original array will affect the view.
No Copy¶
Simple assignment does not make a copy of the array object. Instead, the new name refers to the same object, so id() returns the same value for both. id() returns an identifier that is unique to the object during its lifetime, similar to a pointer in C.
Furthermore, any change made through either name is reflected in the other. For example, changing the shape of one changes the shape of the other too.
a = np.arange(6)
print('Our array is:')
print(a)
print('Applying id() function:')
print(id(a))
print('a is assigned to b:')
b = a
print(b)
print('b has same id():')
print(id(b))
print('Change shape of b:')
b.shape = 3,2
print(b)
print('Shape of a also gets changed:')
print(a)
Our array is:
[0 1 2 3 4 5]
Applying id() function:
137931171056688
a is assigned to b:
[0 1 2 3 4 5]
b has same id():
137931171056688
Change shape of b:
[[0 1]
 [2 3]
 [4 5]]
Shape of a also gets changed:
[[0 1]
 [2 3]
 [4 5]]
COPY:¶
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
arr[0] = 42
print(arr)
print(x)
[42 2 3 4 5] [1 2 3 4 5]
VIEW:¶
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
arr[0] = 42
print(arr)
print(x)
[42 2 3 4 5] [42 2 3 4 5]
Make Changes in the VIEW:¶
- The original array SHOULD be affected by the changes made to the view.
arr = np.array([1, 2, 3, 4, 5])
x = arr.view()
x[0] = 31
print(arr)
print(x)
[31 2 3 4 5] [31 2 3 4 5]
Check if Array Owns its Data¶
As mentioned above, copies own their data and views do not, but how can we check this?
Every NumPy array has the attribute base, which is None if the array owns its data.
Otherwise, the base attribute refers to the original object.
For a copy, base is None. For a view, base is the original array.
arr = np.array([1, 2, 3, 4, 5])
x = arr.copy()
y = arr.view()
print(x.base)
print(y.base)
None [1 2 3 4 5]
NumPy Array Shape¶
Shape of an Array¶
The shape of an array is the number of elements in each dimension.
NumPy arrays have an attribute called shape that returns a tuple in which each entry gives the number of elements along the corresponding dimension.
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
print(arr.shape)
(2, 4)
# this reshapes the ndarray in place
a = np.array([[1,2,3],[4,5,6]])
a.shape = (3,2)
print(a)
[[1 2] [3 4] [5 6]]
ndarray.ndim¶
- This array attribute returns the number of array dimensions.
#Create an array with 5 dimensions using ndmin using a vector with values 1,2,3,4 and verify that last dimension has value 4:
arr = np.array([1, 2, 3, 4], ndmin=5)
print(arr)
print('shape of array :', arr.shape)
[[[[[1 2 3 4]]]]] shape of array : (1, 1, 1, 1, 4)
# an array of evenly spaced numbers
a = np.arange(24)
print(a)
[ 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23]
# a is a one-dimensional array
a = np.arange(24)
a.ndim
# now reshape it
b = a.reshape(2,4,3)
print(b)
# b has three dimensions
[[[ 0 1 2] [ 3 4 5] [ 6 7 8] [ 9 10 11]] [[12 13 14] [15 16 17] [18 19 20] [21 22 23]]]
ndarray.itemsize¶
- This array attribute returns the length of each element of array in bytes.
# dtype of array is int8 (1 byte)
x = np.array([1,2,3,4,5], dtype = np.int8)
print(x.itemsize)
1
# dtype of array is now float32 (4 bytes)
x = np.array([1,2,3,4,5], dtype = np.float32)
print(x.itemsize)
4
ndarray.flags¶
- The ndarray object has the following flags. Their current values are returned by the flags attribute.
| No. | Attribute | Description |
|---|---|---|
| 1 | C_CONTIGUOUS (C) | The data is in a single, C-style contiguous segment |
| 2 | F_CONTIGUOUS (F) | The data is in a single, Fortran-style contiguous segment |
| 3 | OWNDATA (O) | The array owns the memory it uses or borrows it from another object |
| 4 | WRITEABLE (W) | The data area can be written to. Setting this to False locks the data, making it read-only |
| 5 | ALIGNED (A) | The data and all elements are aligned appropriately for the hardware |
| 6 | UPDATEIFCOPY (U) | This array is a copy of some other array. When this array is deallocated, the base array will be updated with the contents of this array (deprecated in favor of WRITEBACKIFCOPY and removed in later NumPy releases) |
x = np.array([1,2,3,4,5])
print(x.flags)
C_CONTIGUOUS : True
F_CONTIGUOUS : True
OWNDATA : True
WRITEABLE : True
ALIGNED : True
WRITEBACKIFCOPY : False
UPDATEIFCOPY : False
What does the shape tuple represent?¶
The integer at each index gives the number of elements in the corresponding dimension.
In the ndmin=5 example above, the shape is (1, 1, 1, 1, 4): index 4 holds the value 4, so the 5th (4 + 1) dimension has 4 elements.
NumPy Array Reshaping¶
Reshaping arrays¶
Reshaping means changing the shape of an array.
The shape of an array is the number of elements in each dimension.
By reshaping we can add or remove dimensions or change number of elements in each dimension.
Reshape From 1-D to 2-D¶
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(4, 3)
print(newarr)
[[ 1 2 3] [ 4 5 6] [ 7 8 9] [10 11 12]]
Reshape From 1-D to 3-D¶
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
newarr = arr.reshape(2, 3, 2)
print(newarr)
[[[ 1 2] [ 3 4] [ 5 6]] [[ 7 8] [ 9 10] [11 12]]]
Can We Reshape Into any Shape?¶
Yes, as long as both shapes contain the same total number of elements.
We can reshape an 8-element 1-D array into a 2-D array with 2 rows of 4 elements, but we cannot reshape it into a 2-D array with 3 rows of 3 elements, as that would require 3 x 3 = 9 elements.
#Try converting 1D array with 8 elements to a 2D array with 3 elements in each dimension (will raise an error):
#arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
#newarr = arr.reshape(3, 3)
#print(newarr)
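The failing reshape can be observed safely by catching the exception. This is a minimal sketch of the commented-out example; the (2, 4) shape is added here only as a contrasting case that does work.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])

try:
    newarr = arr.reshape(3, 3)  # needs 9 elements, but arr has 8
except ValueError as exc:
    newarr = None
    print('reshape failed:', exc)

# A shape whose sizes multiply to 8 works.
ok = arr.reshape(2, 4)
print(ok.shape, ok.size == arr.size)
```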
Returns Copy or View?¶
#Check if the returned array is a copy or a view
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
print(arr.reshape(2, 4).base)
#The example above returns the original array, so it is a view.
[1 2 3 4 5 6 7 8]
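reshape() returns a view only when the new shape can be expressed without moving data. As a hedged illustration, flattening a transposed array forces NumPy to copy; `np.shares_memory` is used here to check, since the base attribute alone can be harder to interpret in this case.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)
t = a.T  # transposed view; no longer C-contiguous

flat_a = a.reshape(-1)  # can be expressed as a view of a
flat_t = t.reshape(-1)  # elements must be rearranged, so the data is copied

print(np.shares_memory(a, flat_a))
print(np.shares_memory(a, flat_t))
print(flat_t)
```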
Unknown Dimension¶
You are allowed to have one "unknown" dimension.
Meaning that you do not have to specify an exact number for one of the dimensions in the reshape method.
Pass -1 as the value, and NumPy will calculate this number for you.
Note: We can not pass -1 to more than one dimension.
#Convert 1D array with 8 elements to 3D array with 2x2 elements
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
newarr = arr.reshape(2, 2, -1)
print(newarr)
[[[1 2] [3 4]] [[5 6] [7 8]]]
Flattening the arrays¶
Flattening array means converting a multidimensional array into a 1D array.
We can use reshape(-1) to do this.
Note: NumPy has many functions for changing the shape of arrays (flatten, ravel) and for rearranging elements (rot90, flip, fliplr, flipud, etc.). These fall under the intermediate to advanced sections of NumPy.
arr = np.array([[1, 2, 3], [4, 5, 6]])
newarr = arr.reshape(-1)
print(newarr)
[1 2 3 4 5 6]
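For a quick feel of the functions named in the note, the sketch below applies them to the same small array. It is only an illustration with made-up data: flatten() always returns a copy, whereas ravel() returns a view when it can.

```python
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])

flat_copy = arr.flatten()  # always a new array
flat_view = arr.ravel()    # a view here, because arr is contiguous
print(np.shares_memory(arr, flat_copy), np.shares_memory(arr, flat_view))

print(np.fliplr(arr))  # reverse the order of the columns
print(np.flipud(arr))  # reverse the order of the rows
print(np.flip(arr))    # reverse along every axis
print(np.rot90(arr))   # rotate 90 degrees counterclockwise
```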
NumPy Array Iterating¶
Iterating Arrays¶
Iterating means going through elements one by one.
As we deal with multi-dimensional arrays in NumPy, we can do this using Python's basic for loop.
If we iterate on a 1-D array it will go through each element one by one.
arr = np.array([1, 2, 3])
for x in arr:
    print(x)
1 2 3
Iterating 2-D Arrays¶
In a 2-D array it will go through all the rows.
If we iterate over an n-D array, it will go through its (n-1)-D sub-arrays one by one.
arr = np.array([[1, 2, 3], [4, 5, 6]])
for x in arr:
    print(x)
[1 2 3] [4 5 6]
# To return the actual values, the scalars, we have to iterate the arrays in each dimension.
#Iterate on each scalar element of the 2-D array:
arr = np.array([[1, 2, 3], [4, 5, 6]])
for x in arr:
    for y in x:
        print(y)
1 2 3 4 5 6
Iterating 3-D Arrays¶
- In a 3-D array it will go through all the 2-D arrays.
# Iterate on the elements of the following 3-D array:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in arr:
    print(x)
[[1 2 3] [4 5 6]] [[ 7 8 9] [10 11 12]]
#To return the actual values, the scalars, we have to iterate the arrays in each dimension.
#Iterate down to the scalars:
arr = np.array([[[1, 2, 3], [4, 5, 6]], [[7, 8, 9], [10, 11, 12]]])
for x in arr:
    for y in x:
        for z in y:
            print(z)
1 2 3 4 5 6 7 8 9 10 11 12
Iterating Arrays Using nditer()¶
- The function nditer() is a helper that can be used for anything from very basic to very advanced iteration. It solves some basic issues that we face in iteration; let's go through it with examples.
- It is an efficient multidimensional iterator object using which it is possible to iterate over an array. Each element of an array is visited using Python’s standard Iterator interface.
Iterating on Each Scalar Element¶
- With basic for loops, iterating through each scalar of an n-dimensional array requires n nested loops, which can be difficult to write for arrays with very high dimensionality.
arr = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
for x in np.nditer(arr):
    print(x)
1 2 3 4 5 6 7 8
- The order of iteration is chosen to match the memory layout of the array, rather than a fixed C or Fortran ordering. This can be seen by iterating over a transposed array: the elements come out in the same order as they would for the original.
a = np.arange(0,60,5)
a = a.reshape(3,4)
print('Original array is:')
print(a)
print('\n')
print('Transpose of the original array is:')
b = a.T
print(b)
print('\n')
print('Modified array is:')
for x in np.nditer(b):
    print(x)
Original array is:
[[ 0  5 10 15]
 [20 25 30 35]
 [40 45 50 55]]

Transpose of the original array is:
[[ 0 20 40]
 [ 5 25 45]
 [10 30 50]
 [15 35 55]]

Modified array is:
0 5 10 15 20 25 30 35 40 45 50 55
Iteration Order¶
- If the same elements are stored using F-style order, the iterator chooses the more efficient way of iterating over an array.
import numpy as np
a = np.arange(0,60,5)
a = a.reshape(3,4)
print('Original array is:')
print(a)
print('\n')
print('Transpose of the original array is:')
b =(a.T)
print(b)
print ('\n')
print ('Sorted in C-style order:')
c = b.copy(order = 'C')
print(c)
for x in np.nditer(c):
    print(x)
print('\n')
print('Sorted in F-style order:')
c = b.copy(order = 'F')
print(c)
for x in np.nditer(c):
    print(x)
Original array is:
[[ 0  5 10 15]
 [20 25 30 35]
 [40 45 50 55]]

Transpose of the original array is:
[[ 0 20 40]
 [ 5 25 45]
 [10 30 50]
 [15 35 55]]

Sorted in C-style order:
[[ 0 20 40]
 [ 5 25 45]
 [10 30 50]
 [15 35 55]]
0 20 40 5 25 45 10 30 50 15 35 55

Sorted in F-style order:
[[ 0 20 40]
 [ 5 25 45]
 [10 30 50]
 [15 35 55]]
0 5 10 15 20 25 30 35 40 45 50 55
- It is possible to force nditer object to use a specific order by explicitly mentioning it.
a = np.arange(0,60,5)
a = a.reshape(3,4)
print('Original array is:')
print(a)
print('\n')
print('Sorted in C-style order:')
for x in np.nditer(a, order = 'C'):
    print(x)
print('\n')
print('Sorted in F-style order:')
for x in np.nditer(a, order = 'F'):
    print(x)
Original array is:
[[ 0  5 10 15]
 [20 25 30 35]
 [40 45 50 55]]

Sorted in C-style order:
0 5 10 15 20 25 30 35 40 45 50 55

Sorted in F-style order:
0 20 40 5 25 45 10 30 50 15 35 55
Iterating Array With Different Data Types¶
We can use the op_dtypes argument and pass it the expected data type to change the data type of the elements while iterating.
NumPy does not change the data type of the elements in place (where the elements are stored in the array), so it needs some extra space to perform the conversion. That extra space is called a buffer, and to enable it in nditer() we pass flags=['buffered'].
arr = np.array([1, 2, 3])
for x in np.nditer(arr, flags=['buffered'], op_dtypes=['S']):
    print(x)
b'1' b'2' b'3'
Iterating With Different Step Size¶
- We can slice the array with a step first and then iterate over the result.
#Iterate through every scalar element of the 2D array skipping 1 element:
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for x in np.nditer(arr[:, ::2]):
    print(x)
1 3 5 7
Enumerated Iteration Using ndenumerate()¶
Enumeration means listing the sequence number of items one by one.
Sometimes we need the index of each element while iterating; the ndenumerate() function can be used for those use cases.
#For 1d array
arr = np.array([1, 2, 3])
for idx, x in np.ndenumerate(arr):
    print(idx, x)
(0,) 1 (1,) 2 (2,) 3
#For 2d array
arr = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
for idx, x in np.ndenumerate(arr):
    print(idx, x)
(0, 0) 1 (0, 1) 2 (0, 2) 3 (0, 3) 4 (1, 0) 5 (1, 1) 6 (1, 2) 7 (1, 3) 8
Modifying Array Values¶
- The nditer object has another optional parameter called op_flags. Its default value is readonly, but it can be set to readwrite or writeonly. This enables modifying array elements through the iterator.
a = np.arange(0,60,5)
a = a.reshape(3,4)
print('Original array is:')
print(a)
print('\n')
for x in np.nditer(a, op_flags = ['readwrite']):
    x[...] = 2*x
print('Modified array is:')
print(a)
Original array is:
[[ 0  5 10 15]
 [20 25 30 35]
 [40 45 50 55]]

Modified array is:
[[  0  10  20  30]
 [ 40  50  60  70]
 [ 80  90 100 110]]
External Loop¶
- The nditer class constructor has a 'flags' parameter, which can take the following values:
| Parameter | Description |
|---|---|
| c_index | The C-order index is tracked |
| f_index | The Fortran-order index is tracked |
| multi_index | A multi-index (one index per dimension) is tracked for each iteration |
| external_loop | Causes the values given to be one-dimensional arrays with multiple values instead of zero-dimensional arrays |
a = np.arange(0,60,5)
a = a.reshape(3,4)
print('Original array is:')
print(a)
print('\n')
print('Modified array is:')
for x in np.nditer(a, flags = ['external_loop'], order = 'F'):
    print(x)
Original array is: [[ 0 5 10 15] [20 25 30 35] [40 45 50 55]] Modified array is: [ 0 20 40] [ 5 25 45] [10 30 50] [15 35 55]
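The table above lists multi_index but the notes only demonstrate external_loop. A small sketch of index tracking, assuming a 2x3 array chosen purely for illustration:

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

it = np.nditer(a, flags=['multi_index'])
pairs = []
for x in it:
    # it.multi_index holds the (row, column) position of the current element
    pairs.append((it.multi_index, int(x)))

print(pairs)
```

Each entry pairs a position tuple with the value stored there, similar to what ndenumerate() yields.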
Broadcasting Iteration¶
- If two arrays are broadcastable, a combined nditer object can iterate over them concurrently. Assuming that an array a has shape 3x4 and another array b has 4 elements (treated as shape 1x4), the following kind of iterator is used (array b is broadcast to the shape of a).
a = np.arange(0,60,5)
a = a.reshape(3,4)
print('Original array is:')
print(a)
print('\n')
print('Second array is:')
b = np.array([1, 2, 3, 4], dtype = int)
print(b)
print('\n')
print('Modified array is:')
for x,y in np.nditer([a,b]):
    print("%d:%d" % (x,y))
Original array is: [[ 0 5 10 15] [20 25 30 35] [40 45 50 55]] Second array is: [1 2 3 4] Modified array is: 0:1 5:2 10:3 15:4 20:1 25:2 30:3 35:4 40:1 45:2 50:3 55:4
NumPy Joining Array¶
Joining means putting contents of two or more arrays in a single array.
In SQL we join tables based on a key, whereas in NumPy we join arrays by axes.
We pass a sequence of arrays that we want to join to the concatenate() function, along with the axis. If axis is not explicitly passed, it is taken as 0.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.concatenate((arr1, arr2))
print(arr)
[1 2 3 4 5 6]
arr1 = np.array([[1, 2], [3, 4]])
arr2 = np.array([[5, 6], [7, 8]])
arr = np.concatenate((arr1, arr2), axis=1)
print(arr)
[[1 2 5 6] [3 4 7 8]]
Joining Arrays Using Stack Functions¶
Stacking is the same as concatenation, except that stacking joins the arrays along a new axis.
For example, two 1-D arrays can be joined along a new first axis, which places one on top of the other, or along a new second axis, which places them side by side as columns.
We pass a sequence of arrays that we want to join to the stack() function along with the axis. If axis is not explicitly passed, it is taken as 0.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.stack((arr1, arr2), axis=1)
print(arr)
[[1 4] [2 5] [3 6]]
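To make the "new axis" point concrete, here is a short sketch comparing result shapes (the variable names are illustrative):

```python
import numpy as np

arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])

joined = np.concatenate((arr1, arr2))           # stays 1-D
stacked_rows = np.stack((arr1, arr2), axis=0)   # new leading axis
stacked_cols = np.stack((arr1, arr2), axis=1)   # new trailing axis

print(joined.shape, stacked_rows.shape, stacked_cols.shape)
# (6,) (2, 3) (3, 2)
```

concatenate() keeps the number of dimensions, while stack() always adds one.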
Stacking Along Rows¶
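- NumPy provides a helper function, hstack(), to stack arrays horizontally: 1-D arrays are joined end to end, and 2-D arrays are joined side by side so that each row gets longer.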
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.hstack((arr1, arr2))
print(arr)
[1 2 3 4 5 6]
Stacking Along Columns¶
- NumPy provides a helper function, vstack(), to stack arrays vertically, one on top of the other, so that each input becomes one or more rows of the result.
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.vstack((arr1, arr2))
print(arr)
[[1 2 3] [4 5 6]]
Stacking Along Height (depth)¶
- NumPy provides a helper function, dstack(), to stack along height, which is the same as depth (the third axis).
arr1 = np.array([1, 2, 3])
arr2 = np.array([4, 5, 6])
arr = np.dstack((arr1, arr2))
print(arr)
[[[1 4] [2 5] [3 6]]]
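The three helpers are easier to tell apart on 2-D inputs. A brief sketch, using two 2x2 arrays that are not from the original notes:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

h = np.hstack((a, b))  # side by side: rows get longer
v = np.vstack((a, b))  # one on top of the other: more rows
d = np.dstack((a, b))  # paired element by element along a third axis

print(h.shape, v.shape, d.shape)
# (2, 4) (4, 2) (2, 2, 2)
```

For 2-D inputs, hstack() matches concatenate() with axis=1 and vstack() matches concatenate() with axis=0.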
NumPy Splitting Array¶
Splitting NumPy Arrays¶
Splitting is the reverse operation of joining.
Joining merges multiple arrays into one, and splitting breaks one array into multiple.
We use array_split() for splitting arrays; we pass it the array we want to split and the number of splits.
Note: The return value is a list containing three arrays.
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr)
[array([1, 2]), array([3, 4]), array([5, 6])]
- If the array has fewer elements than required, it will adjust from the end accordingly.
- Note: We also have the split() function available, but it will not adjust the elements when the source array cannot be divided equally. In the example below, array_split() works properly but split() would fail.
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 4)
print(newarr)
[array([1, 2]), array([3, 4]), array([5]), array([6])]
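A small sketch of the difference noted above, assuming the same six-element array: array_split() adjusts the piece sizes, while split() raises a ValueError when the division is not equal.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6])

# array_split tolerates an uneven division
parts = np.array_split(arr, 4)
sizes = [len(p) for p in parts]

# split insists on equal-sized pieces
try:
    np.split(arr, 4)
    split_failed = False
except ValueError:
    split_failed = True

print(sizes)          # [2, 2, 1, 1]
print(split_failed)   # True
```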
Split Into Arrays¶
- The return value of array_split() is a list containing each of the splits as an array.
- If you split an array into 3 arrays, you can access them from the result just like any list element:
arr = np.array([1, 2, 3, 4, 5, 6])
newarr = np.array_split(arr, 3)
print(newarr[0])
print(newarr[1])
print(newarr[2])
[1 2] [3 4] [5 6]
Splitting 2-D Arrays¶
Use the same syntax when splitting 2-D arrays.
Use the array_split() method, pass in the array you want to split and the number of splits you want to do.
arr = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
newarr = np.array_split(arr, 3)
print(newarr)
[array([[1, 2],
[3, 4]]), array([[5, 6],
[7, 8]]), array([[ 9, 10],
[11, 12]])]
The example above returns three 2-D arrays.
Let's look at another example, this time each element in the 2-D arrays contains 3 elements.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.array_split(arr, 3)
print(newarr)
[array([[1, 2, 3],
[4, 5, 6]]), array([[ 7, 8, 9],
[10, 11, 12]]), array([[13, 14, 15],
[16, 17, 18]])]
The example above returns three 2-D arrays.
In addition, you can specify which axis you want to do the split around.
The example below also returns three 2-D arrays, but they are split along the columns (axis=1).
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.array_split(arr, 3, axis=1)
print(newarr)
[array([[ 1],
[ 4],
[ 7],
[10],
[13],
[16]]), array([[ 2],
[ 5],
[ 8],
[11],
[14],
[17]]), array([[ 3],
[ 6],
[ 9],
[12],
[15],
[18]])]
An alternative is hsplit(), the opposite of hstack().
Note: Similar counterparts to vstack() and dstack() are available as vsplit() and dsplit().
# Use the hsplit() function to split the 2-D array into three 2-D arrays, column by column.
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.hsplit(arr, 3)
print(newarr)
[array([[ 1],
[ 4],
[ 7],
[10],
[13],
[16]]), array([[ 2],
[ 5],
[ 8],
[11],
[14],
[17]]), array([[ 3],
[ 6],
[ 9],
[12],
[15],
[18]])]
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12], [13, 14, 15], [16, 17, 18]])
newarr = np.vsplit(arr, 3)
print(newarr)
[array([[1, 2, 3],
[4, 5, 6]]), array([[ 7, 8, 9],
[10, 11, 12]]), array([[13, 14, 15],
[16, 17, 18]])]
NumPy Searching Arrays¶
Searching Arrays¶
You can search an array for a certain value, and return the indexes that get a match.
To search an array, use the where() method.
# Find the indexes where the value is 4:
arr = np.array([1, 2, 3, 4, 5, 4, 4])
x = np.where(arr == 4)
print(x)
(array([3, 5, 6]),)
# Find the indexes where the values are even:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 0)
print(x)
(array([1, 3, 5, 7]),)
# Find the indexes where the values are odd:
arr = np.array([1, 2, 3, 4, 5, 6, 7, 8])
x = np.where(arr%2 == 1)
print(x)
(array([0, 2, 4, 6]),)
Search Sorted¶
There is a function called searchsorted() which performs a binary search in the array and returns the index where the specified value would be inserted to maintain the sort order.
The searchsorted() method is assumed to be used on sorted arrays.
# Find the indexes where the value 7 should be inserted:
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7)
print(x)
# Example explained: The number 7 should be inserted on index 1 to remain the sort order.
# The method starts the search from the left and returns the first index where the number 7 is no longer larger than the next value.
1
Search From the Right Side¶
- By default the leftmost index is returned, but we can pass side='right' to return the rightmost index instead.
# Find the indexes where the value 7 should be inserted, starting from the right:
arr = np.array([6, 7, 8, 9])
x = np.searchsorted(arr, 7, side='right')
print(x)
# Example explained: The number 7 should be inserted on index 2 to remain the sort order.
# The method starts the search from the right and returns the first index where the number 7 is no longer less than the next value.
2
Multiple Values¶
- To search for more than one value, use an array with the specified values.
# Find the indexes where the values 2, 4, and 6 should be inserted:
arr = np.array([1, 3, 5, 7])
x = np.searchsorted(arr, [2, 4, 6])
print(x)
[1 2 3]
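The effect of side is clearest when the value already occurs several times. A minimal sketch with an illustrative array containing duplicates:

```python
import numpy as np

arr = np.array([1, 2, 2, 2, 3])

left = np.searchsorted(arr, 2)                  # before the first 2
right = np.searchsorted(arr, 2, side='right')   # after the last 2

# Inserting at either index keeps the array sorted
inserted = np.insert(arr, left, 2)

print(left, right)   # 1 4
print(inserted)      # [1 2 2 2 2 3]
```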
NumPy Sorting Arrays¶
Sorting Arrays¶
Sorting means putting elements in an ordered sequence.
An ordered sequence is any sequence that has an order corresponding to its elements, like numeric or alphabetical, ascending or descending.
NumPy has a function called sort() that sorts a specified array.
Note: np.sort() returns a sorted copy of the array and leaves the original unchanged. The ndarray method arr.sort(), by contrast, sorts the array in place.
You can also sort arrays of strings, or any other data type.
numpy.sort(a, axis, kind, order)
| Parameter | Description |
|---|---|
| a | Array to be sorted |
| axis | The axis along which the array is to be sorted. Default is -1 (the last axis). If None, the array is flattened before sorting |
| kind | Sorting algorithm to use. Default is 'quicksort' |
| order | If the array has fields defined, the field name(s) to compare first, second, and so on |
arr = np.array([3, 2, 0, 1])
print(np.sort(arr))
[0 1 2 3]
# Sort the array alphabetically
arr = np.array(['banana', 'cherry', 'apple'])
print(np.sort(arr))
['apple' 'banana' 'cherry']
# Sort a boolean array:
arr = np.array([True, False, True])
print(np.sort(arr))
[False True True]
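A short sketch contrasting the copying function with the in-place method (the names sorted_copy, descending and before are illustrative). Reversing the sorted result with a slice is one common way to get descending order.

```python
import numpy as np

arr = np.array([3, 2, 0, 1])

sorted_copy = np.sort(arr)        # new array; arr is untouched
descending = np.sort(arr)[::-1]   # reverse the ascending result
before = arr.copy()               # snapshot to show arr is still unsorted

arr.sort()                        # sorts arr itself and returns None

print(before, sorted_copy, descending, arr)
```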
Sorting a 2-D Array¶
import numpy as np
# If you use sort() on a 2-D array without passing an axis, each row is sorted separately (along the last axis)
arr = np.array([[3, 2, 4], [5, 0, 1]])
print(np.sort(arr))
print('Sort along axis 0:')
print(np.sort(arr, axis = 0))
print('\n')
[[2 3 4] [0 1 5]] Sort along axis 0: [[3 0 1] [5 2 4]]
# Order parameter in sort function
dt = np.dtype([('name', 'S10'),('age', int)])
a = np.array([("raju",21),("anil",25),("ravi", 17), ("amar",27)], dtype = dt)
print('Our array is:')
print(a)
print('\n')
print('Order by name:')
print(np.sort(a, order = 'name'))
Our array is: [(b'raju', 21) (b'anil', 25) (b'ravi', 17) (b'amar', 27)] Order by name: [(b'amar', 27) (b'anil', 25) (b'raju', 21) (b'ravi', 17)]
numpy.argsort()¶
- The numpy.argsort() function performs an indirect sort on input array, along the given axis and using a specified kind of sort to return the array of indices of data. This indices array is used to construct the sorted array.
x = np.array([3, 1, 2])
print('Our array is:')
print(x)
print('\n')
print('Applying argsort() to x:')
y = np.argsort(x)
print(y)
print('\n')
print('Reconstruct original array in sorted order:')
print(x[y])
print('\n')
print('Reconstruct the original array using loop:')
for i in y:
    print(x[i])
Our array is: [3 1 2] Applying argsort() to x: [1 2 0] Reconstruct original array in sorted order: [1 2 3] Reconstruct the original array using loop: 1 2 3
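One practical use of the index array is reordering a second, related array. A sketch with made-up names and ages kept in two parallel arrays:

```python
import numpy as np

names = np.array(['raju', 'anil', 'ravi', 'amar'])
ages = np.array([21, 25, 17, 27])

order = np.argsort(ages)   # indices that would sort ages

print(order)          # [2 0 1 3]
print(names[order])   # names listed from youngest to oldest
print(ages[order])    # [17 21 25 27]
```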
numpy.lexsort()¶
- The numpy.lexsort() function performs an indirect sort using a sequence of keys. The keys can be seen as columns in a spreadsheet. The function returns an array of indices with which the sorted data can be obtained. Note that the last key is the primary sort key.
nm = ('raju','anil','ravi','amar')
dv = ('f.y.', 's.y.', 's.y.', 'f.y.')
ind = np.lexsort((dv,nm))
print('Applying lexsort() function:')
print(ind)
print('\n')
print('Use this index to get sorted data:')
print([nm[i] + ", " + dv[i] for i in ind])
Applying lexsort() function: [3 1 0 2] Use this index to get sorted data: ['amar, f.y.', 'anil, s.y.', 'raju, f.y.', 'ravi, s.y.']
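Because the names in the example above are all distinct, the secondary key never comes into play. A small sketch with tied primary keys (the ages and scores are invented for illustration):

```python
import numpy as np

ages = np.array([30, 25, 30, 25])
scores = np.array([88, 92, 75, 60])

# The last key (ages) is primary; scores break ties within the same age
order = np.lexsort((scores, ages))

print(order)   # [3 1 2 0]
print(list(zip(ages[order].tolist(), scores[order].tolist())))
```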
- NumPy module has a number of functions for searching inside an array. Functions for finding the maximum, the minimum as well as the elements satisfying a given condition are available.
numpy.argmax() and numpy.argmin()¶
- These two functions return the indices of maximum and minimum elements respectively along the given axis.
a = np.array([[30,40,70],[80,20,10],[50,90,60]])
print('Our array is:')
print(a)
print('\n')
print('Applying argmax() function:')
print(np.argmax(a))
print('\n')
print('Flattened array:')
print(a.flatten())
print('\n')
print('Array containing indices of maximum along axis 0:')
maxindex = np.argmax(a, axis = 0)
print(maxindex)
print('\n')
print('Array containing indices of maximum along axis 1:')
maxindex = np.argmax(a, axis = 1)
print(maxindex)
print('\n')
print('Applying argmin() function:')
minindex = np.argmin(a)
print(minindex)
print('\n')
print('Minimum value, read from the flattened array:')
print(a.flatten()[minindex])
print('\n')
print('Array containing indices of minimum along axis 0:')
minindex = np.argmin(a, axis = 0)
print(minindex)
print('\n')
print('Array containing indices of minimum along axis 1:')
minindex = np.argmin(a, axis = 1)
print(minindex)
Our array is: [[30 40 70] [80 20 10] [50 90 60]] Applying argmax() function: 7 Flattened array: [30 40 70 80 20 10 50 90 60] Array containing indices of maximum along axis 0: [1 2 0] Array containing indices of maximum along axis 1: [2 0 1] Applying argmin() function: 5 Minimum value, read from the flattened array: 10 Array containing indices of minimum along axis 0: [0 1 1] Array containing indices of minimum along axis 1: [0 2 0]
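Without an axis, argmax() returns a position in the flattened array. One way to turn that into a row and column is np.unravel_index(), which is not used elsewhere in these notes and is shown here only as a sketch:

```python
import numpy as np

a = np.array([[30, 40, 70], [80, 20, 10], [50, 90, 60]])

flat_index = np.argmax(a)                          # index into the flattened array
row, col = np.unravel_index(flat_index, a.shape)   # convert to 2-D coordinates

print(flat_index)   # 7
print(row, col)     # 2 1
print(a[row, col])  # 90
```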
numpy.nonzero()¶
- The numpy.nonzero() function returns the indices of non-zero elements in the input array.
a = np.array([[30,40,0],[0,20,10],[50,0,60]])
print(np.nonzero (a))
(array([0, 0, 1, 1, 2, 2]), array([0, 1, 1, 2, 0, 2]))
numpy.where()¶
- The where() function returns the indices of elements in an input array where the given condition is satisfied.
x = np.arange(9.).reshape(3, 3)
print('Our array is:')
print(x)
print('Indices of elements > 3')
y = np.where(x > 3)
print(y)
print('Use these indices to get elements satisfying the condition')
print(x[y])
Our array is: [[0. 1. 2.] [3. 4. 5.] [6. 7. 8.]] Indices of elements > 3 (array([1, 1, 2, 2, 2]), array([1, 2, 0, 1, 2])) Use these indices to get elements satisfying the condition [4. 5. 6. 7. 8.]
numpy.extract()¶
- The extract() function returns the elements that satisfy a given condition.
x = np.arange(9.).reshape(3, 3)
print('Our array is:')
print(x)
# define a condition
condition = np.mod(x,2) == 0
print('Element-wise value of condition')
print(condition)
print('Extract elements using condition')
print(np.extract(condition, x))
Our array is: [[0. 1. 2.] [3. 4. 5.] [6. 7. 8.]] Element-wise value of condition [[ True False True] [False True False] [ True False True]] Extract elements using condition [0. 2. 4. 6. 8.]
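For a boolean condition of the same shape as the array, extract() gives the same result as indexing with the condition directly. A minimal sketch:

```python
import numpy as np

x = np.arange(9.).reshape(3, 3)
condition = np.mod(x, 2) == 0

via_extract = np.extract(condition, x)
via_mask = x[condition]   # boolean indexing with the same mask

print(via_extract)
print(np.array_equal(via_extract, via_mask))   # True
```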
NumPy Filter Array¶
Filtering Arrays¶
Getting some elements out of an existing array and creating a new array out of them is called filtering.
In NumPy, you filter an array using a boolean index list.
A boolean index list is a list of booleans corresponding to indexes in the array.
If the value at an index is True that element is contained in the filtered array, if the value at that index is False that element is excluded from the filtered array.
# Create an array from the elements on index 0 and 2:
arr = np.array([41, 42, 43, 44])
x = [True, False, True, False]
newarr = arr[x]
print(newarr)
[41 43]
Creating the Filter Array¶
- In the example above we hard-coded the True and False values, but the common use is to create a filter array based on conditions.
# Create a filter array that will return only values higher than 42:
arr = np.array([41, 42, 43, 44])
# Create an empty list
filter_arr = []
# go through each element in arr
for element in arr:
    # if the element is higher than 42, set the value to True, otherwise False:
    if element > 42:
        filter_arr.append(True)
    else:
        filter_arr.append(False)
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
[False, False, True, True] [43 44]
# Create a filter array that will return only even elements from the original array:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
# Create an empty list
filter_arr = []
# go through each element in arr
for element in arr:
    # if the element is completely divisible by 2, set the value to True, otherwise False
    if element % 2 == 0:
        filter_arr.append(True)
    else:
        filter_arr.append(False)
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
[False, True, False, True, False, True, False] [2 4 6]
Creating Filter Directly From Array¶
The above example is quite a common task in NumPy and NumPy provides a nice way to tackle it.
We can directly substitute the array instead of the iterable variable in our condition and it will work just as we expect it to.
# Create a filter array that will return only values higher than 42:
arr = np.array([41, 42, 43, 44])
filter_arr = arr > 42
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
[False False True True] [43 44]
# Create a filter array that will return only even elements from the original array:
arr = np.array([1, 2, 3, 4, 5, 6, 7])
filter_arr = arr % 2 == 0
newarr = arr[filter_arr]
print(filter_arr)
print(newarr)
[False True False True False True False] [2 4 6]
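Filter arrays built this way can also be combined. A short sketch, assuming the element-wise operators & (and), | (or) and ~ (not); each comparison needs its own parentheses because these operators bind more tightly than comparisons.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5, 6, 7])

both = arr[(arr > 2) & (arr % 2 == 0)]   # greater than 2 AND even
either = arr[(arr < 2) | (arr > 6)]      # less than 2 OR greater than 6
negated = arr[~(arr % 2 == 0)]           # NOT even

print(both)      # [4 6]
print(either)    # [1 7]
print(negated)   # [1 3 5 7]
```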
NumPy - Array Creation Routines¶
- A new ndarray object can be constructed by any of the following array creation routines or using a low-level ndarray constructor.
numpy.empty¶
- It creates an uninitialized array of specified shape and dtype. It uses the following constructor
numpy.empty(shape, dtype = float, order = 'C')
- The constructor takes the following parameters.
| Parameter | Description |
|---|---|
| shape | Shape of the new array, as an int or tuple of ints |
| dtype | Desired output data type. Optional |
| order | 'C' for C-style row-major array, 'F' for Fortran-style column-major array |
x = np.empty([3,2], dtype = int)
print(x)
# Note − The elements in an array show random values as they are not initialized.
[[3 0] [1 5] [2 4]]
numpy.zeros¶
- Returns a new array of specified size, filled with zeros.
numpy.zeros(shape, dtype = float, order = 'C')
- The constructor takes the following parameters.
| Parameter | Description |
|---|---|
| shape | Shape of the new array, as an int or sequence of ints |
| dtype | Desired output data type. Optional |
| order | 'C' for C-style row-major array, 'F' for Fortran-style column-major array |
# array of five zeros. Default dtype is float
x = np.zeros(5)
print(x)
[0. 0. 0. 0. 0.]
x = np.zeros((5,), dtype = int)
print(x)
[0 0 0 0 0]
# custom type
x = np.zeros((2,2), dtype = [('x', 'i4'), ('y', 'i4')])
print(x)
[[(0, 0) (0, 0)] [(0, 0) (0, 0)]]
numpy.ones¶
- Returns a new array of specified size and type, filled with ones.
numpy.ones(shape, dtype = None, order = 'C')
- The constructor takes the following parameters.
| Parameter | Description |
|---|---|
| shape | Shape of the new array, as an int or tuple of ints |
| dtype | Desired output data type. Optional |
| order | 'C' for C-style row-major array, 'F' for Fortran-style column-major array |
# array of five ones. Default dtype is float
x = np.ones(5)
print(x)
[1. 1. 1. 1. 1.]
x = np.ones([2,2], dtype = int)
print(x)
[[1 1] [1 1]]
NumPy - Array From Existing Data¶
numpy.asarray¶
- This function is similar to numpy.array, except that it has fewer parameters and does not copy the input when it is already an ndarray of a matching dtype. This routine is useful for converting a Python sequence into an ndarray.
numpy.asarray(a, dtype = None, order = None)
- The constructor takes the following parameters.
| Parameter | Description |
|---|---|
| a | Input data in any form such as list, list of tuples, tuples, tuple of tuples or tuple of lists |
| dtype | By default, the data type of input data is applied to the resultant ndarray |
| order | C (row major) or F (column major). C is default |
# convert list to ndarray
x = [1,2,3]
a = np.asarray(x)
print(a)
[1 2 3]
# dtype is set
x = [1,2,3]
a = np.asarray(x, dtype = float)
print(a)
[1. 2. 3.]
# ndarray from tuple
x = (1,2,3)
a = np.asarray(x)
print(a)
[1 2 3]
# ndarray from a list of tuples of different lengths (ragged input needs dtype = object)
x = [(1,2,3),(4,5)]
a = np.asarray(x, dtype = object)
print(a)
[(1, 2, 3) (4, 5)]
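One practical difference from numpy.array is copying behaviour when the input is already an ndarray. A minimal sketch (the variable names are illustrative):

```python
import numpy as np

existing = np.array([1, 2, 3])

same = np.asarray(existing)   # no copy: returns the input array itself
copied = np.array(existing)   # np.array copies by default

existing[0] = 99

print(same is existing)   # True
print(same[0], copied[0]) # 99 1
```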
numpy.frombuffer¶
- This function interprets a buffer as one-dimensional array. Any object that exposes the buffer interface is used as parameter to return an ndarray.
numpy.frombuffer(buffer, dtype = float, count = -1, offset = 0)
- The constructor takes the following parameters.
| Parameter | Description |
|---|---|
| buffer | Any object that exposes buffer interface |
| dtype | Data type of returned ndarray. Defaults to float |
| count | The number of items to read, default -1 means all data |
| offset | The starting position to read from. Default is 0 |
l = b'hello world'
print(type(l))
a = np.frombuffer(l, dtype = "S1")
print(a)
print(type(a))
<class 'bytes'> [b'h' b'e' b'l' b'l' b'o' b' ' b'w' b'o' b'r' b'l' b'd'] <class 'numpy.ndarray'>
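The count and offset parameters from the table can be tried on a small byte string. A sketch, assuming dtype uint8 so that each byte is one item and offset counts in bytes:

```python
import numpy as np

raw = bytes([10, 20, 30, 40, 50])

all_items = np.frombuffer(raw, dtype=np.uint8)
# skip the first byte, then read three items
middle = np.frombuffer(raw, dtype=np.uint8, count=3, offset=1)

print(all_items)   # [10 20 30 40 50]
print(middle)      # [20 30 40]
print(all_items.flags.writeable)   # False: bytes objects are immutable
```

The resulting array shares memory with the buffer, which is why an array built over an immutable bytes object is read-only.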
numpy.fromiter¶
- This function builds an ndarray object from any iterable object. A new one-dimensional array is returned by this function.
numpy.fromiter(iterable, dtype, count = -1)
- Here, the constructor takes the following parameters.
| Parameter | Description |
|---|---|
| iterable | Any iterable object |
| dtype | Data type of resultant array |
| count | The number of items to be read from iterator. Default is -1 which means all data to be read |
- The following examples use the built-in range() function, which returns a range object. An iterator over this object is used to form an ndarray object.
# create a range object using the range function
values = range(5)
print(values)
range(0, 5)
# obtain an iterator object from the range
values = range(5)
it = iter(values)
# use iterator to create ndarray
x = np.fromiter(it, dtype = float)
print(x)
[0. 1. 2. 3. 4.]
NumPy - Array From Numerical Ranges¶
numpy.arange¶
- This function returns an ndarray object containing evenly spaced values within a given range.
numpy.arange(start, stop, step, dtype)
- The constructor takes the following parameters.
| Parameter | Description |
|---|---|
| start | The start of an interval. If omitted, defaults to 0 |
| stop | The end of an interval (not including this number) |
| step | Spacing between values, default is 1 |
| dtype | Data type of resulting ndarray. If not given, data type of input is used |
x = np.arange(5)
print(x)
# dtype set
x = np.arange(5, dtype = float)
print(x)
[0 1 2 3 4] [0. 1. 2. 3. 4.]
# start, stop and step parameters set
x = np.arange(10,20,2)
print(x)
[10 12 14 16 18]
numpy.linspace¶
- This function is similar to arange() function. In this function, instead of step size, the number of evenly spaced values between the interval is specified.
numpy.linspace(start, stop, num, endpoint, retstep, dtype)
- The constructor takes the following parameters.
| Parameter | Description |
|---|---|
| start | The starting value of the sequence |
| stop | The end value of the sequence, included in the sequence if endpoint set to true |
| num | The number of evenly spaced samples to be generated. Default is 50 |
| endpoint | True by default, hence the stop value is included in the sequence. If false, it is not included |
| retstep | If true, returns samples and step between the consecutive numbers |
| dtype | Data type of output ndarray |
x = np.linspace(10,20,5)
print(x)
[10. 12.5 15. 17.5 20. ]
# endpoint set to false
x = np.linspace(10,20, 5, endpoint = False)
print(x)
[10. 12. 14. 16. 18.]
# find retstep value
x = np.linspace(1,2,5, retstep = True)
print(x)
# retstep here is 0.25
(array([1. , 1.25, 1.5 , 1.75, 2. ]), 0.25)
numpy.logspace¶
- This function returns an ndarray object that contains numbers evenly spaced on a log scale. The start and stop endpoints of the scale are exponents of the base, usually 10.
numpy.logspace(start, stop, num, endpoint, base, dtype)
- Following parameters determine the output of logspace function.
| Parameter | Description |
|---|---|
| start | The starting point of the sequence is base ** start |
| stop | The final value of the sequence is base ** stop |
| num | The number of values between the range. Default is 50 |
| endpoint | If true, stop is the last value in the range |
| base | Base of log space, default is 10 |
| dtype | Data type of output array. If not given, it depends upon other input arguments |
# default base is 10
a = np.logspace(1.0, 2.0, num = 10)
print(a)
[ 10. 12.91549665 16.68100537 21.5443469 27.82559402 35.93813664 46.41588834 59.94842503 77.42636827 100. ]
# set base of log space to 2
a = np.logspace(1,10,num = 10, base = 2)
print(a)
[ 2. 4. 8. 16. 32. 64. 128. 256. 512. 1024.]
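Since start and stop are exponents, logspace() can be thought of as raising the base to a linspace() of exponents. A quick sketch of that relationship:

```python
import numpy as np

a = np.logspace(1, 10, num=10, base=2)
b = 2.0 ** np.linspace(1, 10, num=10)   # same exponents, applied by hand

print(a)
print(np.allclose(a, b))   # True
```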
NumPy - Array Manipulation¶
Changing Shape¶
| Shape | Description |
|---|---|
| reshape | Gives a new shape to an array without changing its data |
| flat | A 1-D iterator over the array |
| flatten | Returns a copy of the array collapsed into one dimension |
| ravel | Returns a contiguous flattened array |
numpy.reshape¶
- This function gives a new shape to an array without changing the data. It accepts the following parameters
numpy.reshape(arr, newshape, order)
| Parameter | Description |
|---|---|
| arr | Array to be reshaped |
| newshape | int or tuple of int. New shape should be compatible to the original shape |
| order | 'C' for C style, 'F' for Fortran style, 'A' means Fortran like order if an array is stored in Fortran-like contiguous memory, C style otherwise |
a = np.arange(8)
print('The original array:')
print(a)
print('\n')
b = a.reshape(4,2)
print('The modified array:')
print(b)
The original array: [0 1 2 3 4 5 6 7] The modified array: [[0 1] [2 3] [4 5] [6 7]]
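Two details worth trying out, offered as a sketch: one dimension may be given as -1 so that NumPy infers it, and for a contiguous array like this one the reshaped result is a view that shares data with the original.

```python
import numpy as np

a = np.arange(8)

b = a.reshape(2, -1)   # -1 is inferred as 4
print(b.shape)         # (2, 4)

b[0, 0] = 100          # writing through the view changes a as well
print(a[0])            # 100
print(np.shares_memory(a, b))   # True
```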
numpy.ndarray.flat¶
- This attribute returns a 1-D iterator over the array. It behaves like Python's built-in iterators and also supports indexing.
a = np.arange(8).reshape(2,4)
print('The original array:')
print(a)
print('\n')
print('After applying the flat function:')
# returns element corresponding to index in flattened array
print(a.flat[5])
The original array: [[0 1 2 3] [4 5 6 7]] After applying the flat function: 5
numpy.ndarray.flatten¶
- This function returns a copy of an array collapsed into one dimension. The function takes the following parameters.
ndarray.flatten(order)
| Parameter | Description |
|---|---|
| order | 'C': row major (default). 'F': column major. 'A': flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. 'K': flatten a in the order the elements occur in memory |
a = np.arange(8).reshape(2,4)
print('The original array is:')
print(a)
print('\n')
# default is row-major ('C')
print('The flattened array is:')
print(a.flatten())
print('\n')
print('The flattened array in F-style ordering:')
print(a.flatten(order = 'F'))
The original array is: [[0 1 2 3] [4 5 6 7]] The flattened array is: [0 1 2 3 4 5 6 7] The flattened array in F-style ordering: [0 4 1 5 2 6 3 7]
numpy.ravel¶
- This function returns a flattened one-dimensional array. A copy is made only if needed. The returned array will have the same type as that of the input array.
numpy.ravel(a, order)
- The function takes the following parameters.
| Parameter | Description |
|---|---|
| a | Input array |
| order | 'C': row major (default). 'F': column major. 'A': flatten in column-major order if a is Fortran contiguous in memory, row-major order otherwise. 'K': flatten a in the order the elements occur in memory |
import numpy as np
a = np.arange(8).reshape(2,4)
print('The original array is:')
print(a)
print('\n')
print('After applying ravel function:')
print(a.ravel())
print('\n')
print('Applying ravel function in F-style ordering:')
print(a.ravel(order = 'F'))
The original array is: [[0 1 2 3] [4 5 6 7]] After applying ravel function: [0 1 2 3 4 5 6 7] Applying ravel function in F-style ordering: [0 4 1 5 2 6 3 7]
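The outputs of flatten() and ravel() look identical, so the difference is easiest to see by modifying the results. A sketch, assuming a contiguous input for which ravel() can return a view:

```python
import numpy as np

a = np.arange(8).reshape(2, 4)

flat_copy = a.flatten()   # always a copy
flat_view = a.ravel()     # a view here, because a is contiguous

flat_copy[0] = -1         # does not affect a
flat_view[1] = -2         # changes a[0, 1]

print(a[0, 0], a[0, 1])   # 0 -2
print(np.shares_memory(a, flat_copy), np.shares_memory(a, flat_view))
```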
Transpose Operations¶
| Operation | Description |
|---|---|
| transpose | Permutes the dimensions of an array |
| ndarray.T | Same as self.transpose() |
| rollaxis | Rolls the specified axis backwards |
| swapaxes | Interchanges the two axes of an array |
numpy.transpose¶
- This function permutes the dimension of the given array. It returns a view wherever possible.
numpy.transpose(arr, axes)
| Parameter | Description |
|---|---|
| arr | The array to be transposed |
| axes | List of ints, corresponding to the dimensions. By default, the dimensions are reversed |
a = np.arange(12).reshape(3,4)
print('The original array is:')
print(a)
print('\n')
print('The transposed array is:')
print(np.transpose(a))
The original array is: [[ 0 1 2 3] [ 4 5 6 7] [ 8 9 10 11]] The transposed array is: [[ 0 4 8] [ 1 5 9] [ 2 6 10] [ 3 7 11]]
numpy.ndarray.T¶
- This attribute belongs to the ndarray class. It behaves like numpy.transpose called without the axes argument.
a = np.arange(12).reshape(3,4)
print('The original array is:')
print(a)
print('\n')
print('Array after applying the function:')
print(a.T)
The original array is: [[ 0 1 2 3] [ 4 5 6 7] [ 8 9 10 11]] Array after applying the function: [[ 0 4 8] [ 1 5 9] [ 2 6 10] [ 3 7 11]]
numpy.rollaxis¶
- This function rolls the specified axis backwards, until it lies in a specified position.
numpy.rollaxis(arr, axis, start)
| Parameter | Description |
|---|---|
| arr | Input array |
| axis | Axis to roll backwards. The position of the other axes do not change relative to one another |
| start | Zero by default leading to the complete roll. Rolls until it reaches the specified position |
# It creates 3 dimensional ndarray
a = np.arange(8).reshape(2,2,2)
print('The original array:')
print(a)
print('\n')
# roll axis 2 back to position 0 (along width to along depth)
print('After applying rollaxis function:')
print(np.rollaxis(a,2))
# roll axis 2 back to position 1 (along width to along height)
print('\n')
print('After applying rollaxis function:')
print(np.rollaxis(a,2,1))
The original array: [[[0 1] [2 3]] [[4 5] [6 7]]] After applying rollaxis function: [[[0 2] [4 6]] [[1 3] [5 7]]] After applying rollaxis function: [[[0 2] [1 3]] [[4 6] [5 7]]]
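With a 2x2x2 array every result has the same shape, which hides what rollaxis() does. A sketch using distinct axis lengths (3, 4, 5), chosen only for illustration; np.moveaxis() is shown alongside because NumPy's documentation recommends it as the clearer alternative.

```python
import numpy as np

a = np.zeros((3, 4, 5))

print(np.rollaxis(a, 2).shape)      # (5, 3, 4): axis 2 rolled to the front
print(np.rollaxis(a, 2, 1).shape)   # (3, 5, 4): axis 2 rolled to position 1
print(np.moveaxis(a, 2, 0).shape)   # (5, 3, 4): same move, stated as source -> destination
```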
numpy.swapaxes¶
- This function interchanges the two axes of an array. For NumPy versions after 1.10, a view of the swapped array is returned.
numpy.swapaxes(arr, axis1, axis2)
| Parameter | Description |
|---|---|
| arr | Input array whose axes are to be swapped |
| axis1 | An int corresponding to the first axis |
| axis2 | An int corresponding to the second axis |
# It creates a 3 dimensional ndarray
a = np.arange(8).reshape(2,2,2)
print('The original array:')
print(a)
print('\n')
# now swap numbers between axis 0 (along depth) and axis 2 (along width)
print('The array after applying the swapaxes function:')
print(np.swapaxes(a, 2, 0))
The original array: [[[0 1] [2 3]] [[4 5] [6 7]]] The array after applying the swapaxes function: [[[0 4] [2 6]] [[1 5] [3 7]]]
Changing Dimensions¶
| Dimension | Description |
|---|---|
| broadcast | Produces an object that mimics broadcasting |
| broadcast_to | Broadcasts an array to a new shape |
| expand_dims | Expands the shape of an array |
| squeeze | Removes single-dimensional entries from the shape of an array |
numpy.broadcast¶
As seen earlier, NumPy has in-built support for broadcasting. This function mimics the broadcasting mechanism. It returns an object that encapsulates the result of broadcasting one array against the other.
The function takes two arrays as input parameters.
x = np.array([[1], [2], [3]])
y = np.array([4, 5, 6])
# to broadcast x against y
b = np.broadcast(x,y)
# the iters attribute is a tuple of iterators, one for each broadcast operand
print('Broadcast x against y:')
r,c = b.iters
print(next(r), next(c))
print(next(r), next(c))
print('\n')
# shape attribute returns the shape of broadcast object
print('The shape of the broadcast object:')
print(b.shape)
print('\n')
# to add x and y manually using broadcast
b = np.broadcast(x,y)
c = np.empty(b.shape)
print('Add x and y manually using broadcast:')
print(c.shape)
print('\n')
c.flat = [u + v for (u,v) in b]
print('After applying the flat function:')
print(c)
print('\n')
# same result obtained by NumPy's built-in broadcasting support
print('The summation of x and y:')
print(x + y)
Broadcast x against y: 1 4 1 5 The shape of the broadcast object: (3, 3) Add x and y manually using broadcast: (3, 3) After applying the flat function: [[5. 6. 7.] [6. 7. 8.] [7. 8. 9.]] The summation of x and y: [[5 6 7] [6 7 8] [7 8 9]]
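The broadcast object can also be inspected without building the result. The sketch below uses illustrative shapes that are not from the example above, and checks that iterating over the object pairs up elements in the same way as the built-in broadcasting.

```python
import numpy as np

x = np.arange(3).reshape(3, 1)   # shape (3, 1)
y = np.arange(4)                 # shape (4,)

b = np.broadcast(x, y)
print(b.shape)   # (3, 4)
print(b.size)    # 12

# Iterating over b yields one (x_element, y_element) pair per output cell.
outer = np.array([u * v for u, v in b]).reshape(b.shape)
print(np.array_equal(outer, x * y))   # True
```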
numpy.broadcast_to¶
- This function broadcasts an array to a new shape. It returns a read-only view of the original array, which is typically not contiguous. The function raises ValueError if the new shape does not comply with NumPy's broadcasting rules.
- Note: This function is available from version 1.10.0 onwards.
numpy.broadcast_to(array, shape, subok)
a = np.arange(4).reshape(1,4)
print('The original array:')
print(a)
print('\n')
print('After applying the broadcast_to function:')
print(np.broadcast_to(a,(4,4)))
The original array: [[0 1 2 3]] After applying the broadcast_to function: [[0 1 2 3] [0 1 2 3] [0 1 2 3] [0 1 2 3]]
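The read-only and ValueError behaviour described for broadcast_to can be checked with a small sketch. The target shape (4, 3) is just an illustrative shape that violates the broadcasting rules.

```python
import numpy as np

a = np.arange(4).reshape(1, 4)
v = np.broadcast_to(a, (4, 4))

# The result is a read-only view; the repeated rows are not copied,
# which shows up as a stride of 0 along the broadcast axis.
print(v.flags.writeable)   # False
print(v.strides[0])        # 0

# An incompatible target shape raises ValueError.
try:
    np.broadcast_to(a, (4, 3))
    failed = False
except ValueError:
    failed = True
print(failed)              # True
```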
numpy.expand_dims¶
- This function expands the array by inserting a new axis at the specified position. Two parameters are required by this function.
numpy.expand_dims(arr, axis)
| Parameter | Description |
|---|---|
| arr | Input array |
| axis | Position where the new axis is to be inserted |
x = np.array(([1,2],[3,4]))
print('Array x:')
print(x)
print('\n')
y = np.expand_dims(x, axis = 0)
print('Array y:')
print(y)
print('\n')
print('The shape of X and Y array:')
print(x.shape, y.shape)
print('\n')
# insert axis at position 1
y = np.expand_dims(x, axis = 1)
print('Array Y after inserting axis at position 1:')
print(y)
print('\n')
print('x.ndim and y.ndim:')
print(x.ndim,y.ndim)
print('\n')
print('x.shape and y.shape:')
print(x.shape, y.shape)
Array x: [[1 2] [3 4]] Array y: [[[1 2] [3 4]]] The shape of X and Y array: (2, 2) (1, 2, 2) Array Y after inserting axis at position 1: [[[1 2]] [[3 4]]] x.ndim and y.ndim: 2 3 x.shape and y.shape: (2, 2) (2, 1, 2)
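For comparison, the same axis can be inserted by indexing with np.newaxis. This is a small sketch that is not part of the original notes.

```python
import numpy as np

x = np.array([[1, 2], [3, 4]])

# Insert a new axis at position 1 in two ways.
y = np.expand_dims(x, axis=1)
z = x[:, np.newaxis, :]

print(y.shape, z.shape)       # (2, 1, 2) (2, 1, 2)
print(np.array_equal(y, z))   # True

# A negative axis counts from the end of the expanded shape.
print(np.expand_dims(x, axis=-1).shape)   # (2, 2, 1)
```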
numpy.squeeze¶
- This function removes single-dimensional entries (axes of length 1) from the shape of the given array.
numpy.squeeze(arr, axis)
| Parameter | Description |
|---|---|
| arr | Input array |
| axis | int or tuple of int. selects a subset of single dimensional entries in the shape |
x = np.arange(9).reshape(1,3,3)
print('Array X:')
print(x)
print('\n')
y = np.squeeze(x)
print('Array Y:')
print(y)
print('\n')
print('The shapes of X and Y array:')
print(x.shape, y.shape)
Array X: [[[0 1 2] [3 4 5] [6 7 8]]] Array Y: [[0 1 2] [3 4 5] [6 7 8]] The shapes of X and Y array: (1, 3, 3) (3, 3)
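The axis parameter is not used in the example above. A minimal sketch of how it selects which length-1 axes to remove, using an illustrative four-dimensional array:

```python
import numpy as np

x = np.arange(6).reshape(1, 3, 1, 2)

print(np.squeeze(x).shape)                # (3, 2)     all length-1 axes removed
print(np.squeeze(x, axis=0).shape)        # (3, 1, 2)  only axis 0 removed
print(np.squeeze(x, axis=(0, 2)).shape)   # (3, 2)

# Selecting an axis whose length is not 1 raises ValueError.
try:
    np.squeeze(x, axis=1)
    raised = False
except ValueError:
    raised = True
print(raised)   # True
```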
Joining Arrays¶
| Array | Description |
|---|---|
| concatenate | Joins a sequence of arrays along an existing axis |
| stack | Joins a sequence of arrays along a new axis |
| hstack | Stacks arrays in sequence horizontally (column wise) |
| vstack | Stacks arrays in sequence vertically (row wise) |
numpy.concatenate¶
- Concatenation refers to joining. This function joins two or more arrays along a specified existing axis. The arrays must have the same shape, except in the dimension corresponding to the axis.
numpy.concatenate((a1, a2, ...), axis)
| Parameter | Description |
|---|---|
| a1,a2.. | Sequence of arrays with the same shape, except along the joining axis |
| axis | Axis along which arrays have to be joined. Default is 0 |
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
# both the arrays are of same dimensions
print('Joining the two arrays along axis 0:')
print(np.concatenate((a,b)))
print('\n')
print('Joining the two arrays along axis 1:')
print(np.concatenate((a,b),axis = 1))
Joining the two arrays along axis 0: [[1 2] [3 4] [5 6] [7 8]] Joining the two arrays along axis 1: [[1 2 5 6] [3 4 7 8]]
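The arrays only need to agree on the dimensions other than the joining axis. A small sketch with illustrative arrays of different row counts:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6]])           # shape (1, 2)

# Along axis 0 only the number of columns has to match.
rows = np.concatenate((a, b), axis=0)
print(rows.shape)   # (3, 2)

# Along axis 1 the row counts (2 and 1) differ, so this fails.
try:
    np.concatenate((a, b), axis=1)
    mismatch = False
except ValueError:
    mismatch = True
print(mismatch)     # True
```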
numpy.stack¶
- This function joins the sequence of arrays along a new axis.
numpy.stack(arrays, axis)
| Parameter | Description |
|---|---|
| arrays | Sequence of arrays of the same shape |
| axis | Axis in the resultant array along which the input arrays are stacked |
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
print('Stack the two arrays along axis 0:')
print(np.stack((a,b),0))
print('\n')
print('Stack the two arrays along axis 1:')
print(np.stack((a,b),1))
Stack the two arrays along axis 0: [[[1 2] [3 4]] [[5 6] [7 8]]] Stack the two arrays along axis 1: [[[1 2] [5 6]] [[3 4] [7 8]]]
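Because both inputs above are 2 x 2, every result has the same shape. The sketch below uses 3 x 4 inputs (an illustrative choice) to show where the new axis ends up.

```python
import numpy as np

a = np.zeros((3, 4))
b = np.ones((3, 4))

# The new axis of length 2 (one entry per input array) is placed
# at the position given by the axis argument.
print(np.stack((a, b), axis=0).shape)   # (2, 3, 4)
print(np.stack((a, b), axis=1).shape)   # (3, 2, 4)
print(np.stack((a, b), axis=2).shape)   # (3, 4, 2)
```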
numpy.hstack¶
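- This function stacks arrays in sequence horizontally (column-wise). For arrays with two or more dimensions it is equivalent to concatenation along axis 1; 1-D arrays are concatenated along axis 0.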
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
print('Horizontal stacking:')
c = np.hstack((a,b))
print(c)
Horizontal stacking: [[1 2 5 6] [3 4 7 8]]
numpy.vstack¶
- This function stacks arrays in sequence vertically (row-wise). It is equivalent to concatenation along axis 0, with 1-D arrays of shape (N,) first treated as shape (1, N).
a = np.array([[1,2],[3,4]])
b = np.array([[5,6],[7,8]])
print('Vertical stacking:')
c = np.vstack((a,b))
print(c)
Vertical stacking: [[1 2] [3 4] [5 6] [7 8]]
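A short sketch of how hstack and vstack treat 1-D inputs, and how they relate to concatenate for 2-D inputs. The arrays are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1-D inputs: hstack joins end to end, vstack makes each input a row.
h = np.hstack((a, b))
v = np.vstack((a, b))
print(h.shape, v.shape)   # (6,) (2, 3)

# 2-D inputs: both functions match concatenate along a fixed axis.
m = np.array([[1, 2], [3, 4]])
n = np.array([[5, 6], [7, 8]])
print(np.array_equal(np.hstack((m, n)), np.concatenate((m, n), axis=1)))   # True
print(np.array_equal(np.vstack((m, n)), np.concatenate((m, n), axis=0)))   # True
```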
Splitting Arrays¶
| Array | Description |
|---|---|
| split | Splits an array into multiple sub-arrays |
| hsplit | Splits an array into multiple sub-arrays horizontally (column-wise) |
| vsplit | Splits an array into multiple sub-arrays vertically (row-wise) |
numpy.split¶
- This function divides the array into subarrays along a specified axis.
numpy.split(ary, indices_or_sections, axis)
| Parameter | Description |
|---|---|
| ary | Input array to be split |
| indices_or_sections | Can be an integer, indicating the number of equal sized subarrays to be created from the input array. If this parameter is a 1-D array, the entries indicate the points at which a new subarray is to be created. |
| axis | Default is 0 |
import numpy as np
a = np.arange(9)
print('First array:')
print(a)
print('\n')
print('Split the array in 3 equal-sized subarrays:')
b = np.split(a,3)
print(b)
print('\n')
print('Split the array at positions indicated in 1-D array:')
b = np.split(a,[4,7])
print(b)
First array: [0 1 2 3 4 5 6 7 8] Split the array in 3 equal-sized subarrays: [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])] Split the array at positions indicated in 1-D array: [array([0, 1, 2, 3]), array([4, 5, 6]), array([7, 8])]
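When an integer is passed, split requires the array to divide evenly. The sketch below shows the failure case and, as an alternative not covered in these notes, np.array_split, which allows unequal pieces.

```python
import numpy as np

a = np.arange(10)

# 10 elements cannot be divided into 3 equal parts.
try:
    np.split(a, 3)
    uneven = False
except ValueError:
    uneven = True
print(uneven)   # True

# array_split accepts the same arguments but permits unequal sizes.
parts = np.array_split(a, 3)
print([len(p) for p in parts])   # [4, 3, 3]
```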
numpy.hsplit¶
- numpy.hsplit is a special case of the split() function that splits an array horizontally (column-wise). For arrays with two or more dimensions it is equivalent to split() with axis = 1; a 1-D array is split along axis 0.
a = np.arange(16).reshape(4,4)
print('First array:')
print(a)
print('\n')
print('Horizontal splitting:')
b = np.hsplit(a,2)
print(b)
First array:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
Horizontal splitting:
[array([[ 0, 1],
[ 4, 5],
[ 8, 9],
[12, 13]]), array([[ 2, 3],
[ 6, 7],
[10, 11],
[14, 15]])]
numpy.vsplit¶
- numpy.vsplit is a special case of the split() function where axis is 0, indicating a vertical (row-wise) split. The input array must have at least two dimensions.
a = np.arange(16).reshape(4,4)
print('First array:')
print(a)
print('\n')
print('Vertical splitting:')
b = np.vsplit(a,2)
print(b)
First array:
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
Vertical splitting:
[array([[0, 1, 2, 3],
[4, 5, 6, 7]]), array([[ 8, 9, 10, 11],
[12, 13, 14, 15]])]
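The relationship between hsplit, vsplit and split can be checked directly on the same 4 x 4 array. This is a minimal sketch.

```python
import numpy as np

a = np.arange(16).reshape(4, 4)

# hsplit on a 2-D array matches split along axis 1 (columns).
same_h = all(np.array_equal(p, q)
             for p, q in zip(np.hsplit(a, 2), np.split(a, 2, axis=1)))

# vsplit matches split along axis 0 (rows).
same_v = all(np.array_equal(p, q)
             for p, q in zip(np.vsplit(a, 2), np.split(a, 2, axis=0)))

print(same_h, same_v)   # True True
print(np.hsplit(a, 2)[0].shape, np.vsplit(a, 2)[0].shape)   # (4, 2) (2, 4)
```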
Adding / Removing Elements¶
| Element | Description |
|---|---|
| resize | Returns a new array with the specified shape |
| append | Appends the values to the end of an array |
| insert | Inserts the values along the given axis before the given indices |
| delete | Returns a new array with sub-arrays along an axis deleted |
| unique | Finds the unique elements of an array |
numpy.resize¶
- This function returns a new array with the specified shape. If the new size is greater than the original, the new array is filled with repeated copies of the original's entries.
numpy.resize(arr, shape)
| Parameter | Description |
|---|---|
| arr | Input array to be resized |
| shape | New shape of the resulting array |
a = np.array([[1,2,3],[4,5,6]])
print('First array:')
print(a)
print('\n')
print('The shape of first array:')
print(a.shape)
print('\n')
b = np.resize(a, (3,2))
print('Second array:')
print(b)
print('\n')
print('The shape of second array:')
print(b.shape)
print('\n')
# Observe that the first row of a is repeated in b since the new size is bigger
print('Resize the second array:')
b = np.resize(a,(3,3))
print(b)
First array: [[1 2 3] [4 5 6]] The shape of first array: (2, 3) Second array: [[1 2] [3 4] [5 6]] The shape of second array: (3, 2) Resize the second array: [[1 2 3] [4 5 6] [1 2 3]]
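The repetition works on the flattened (row-major) sequence of elements, not row by row. A small sketch with an illustrative 2 x 2 array makes this visible:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# The flattened elements 1, 2, 3, 4 are reused cyclically
# until the new shape is filled.
b = np.resize(a, (2, 3))
print(b)
# [[1 2 3]
#  [4 1 2]]

# The input array is left untouched.
print(a.shape)   # (2, 2)
```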
numpy.append¶
- This function adds values at the end of an input array. The append operation is not in place; a new array is allocated. When an axis is given, the dimensions of the input arrays must match, otherwise a ValueError is raised.
numpy.append(arr, values, axis)
| Parameter | Description |
|---|---|
| arr | Input array |
| values | Values to be appended to arr. Must have the same shape as arr (excluding the axis of appending) |
| axis | The axis along which append operation is to be done. If not given, both parameters are flattened |
a = np.array([[1,2,3],[4,5,6]])
print('Append elements to array:')
print(np.append(a, [7,8,9]))
print('\n')
print('Append elements along axis 0:')
print(np.append(a, [[7,8,9]],axis = 0))
print('\n')
print('Append elements along axis 1:')
print(np.append(a, [[5,5,5],[7,8,9]],axis = 1))
Append elements to array: [1 2 3 4 5 6 7 8 9] Append elements along axis 0: [[1 2 3] [4 5 6] [7 8 9]] Append elements along axis 1: [[1 2 3 5 5 5] [4 5 6 7 8 9]]
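A short sketch of the two points made in the description: the input array is not modified, and a dimension mismatch with an explicit axis raises ValueError.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])

# append returns a new array; a keeps its original shape.
b = np.append(a, [[7, 8, 9]], axis=0)
print(a.shape, b.shape)   # (2, 3) (3, 3)

# A 1-D list cannot be appended to a 2-D array along an axis.
try:
    np.append(a, [7, 8, 9], axis=0)
    raised = False
except ValueError:
    raised = True
print(raised)   # True
```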
numpy.insert¶
- This function inserts values in the input array along the given axis and before the given index. If the type of values differs from that of the input array, values is converted to the type of the input array. Insertion is not done in place; the function returns a new array. Also, if the axis is not mentioned, the input array is flattened.
numpy.insert(arr, obj, values, axis)
| Parameter | Description |
|---|---|
| arr | Input array |
| obj | The index before which insertion is to be made |
| values | The array of values to be inserted |
| axis | The axis along which to insert. If not given, the input array is flattened |
a = np.array([[1,2],[3,4],[5,6]])
print('Axis parameter not passed. The input array is flattened before insertion.')
print(np.insert(a,3,[11,12]))
print('\n')
print('Axis parameter passed. The values array is broadcast to match input array.')
print('Broadcast along axis 0:')
print(np.insert(a,1,[11],axis = 0))
print('\n')
print('Broadcast along axis 1:')
print(np.insert(a,1,11,axis = 1))
Axis parameter not passed. The input array is flattened before insertion. [ 1 2 3 11 12 4 5 6] Axis parameter passed. The values array is broadcast to match input array. Broadcast along axis 0: [[ 1 2] [11 11] [ 3 4] [ 5 6]] Broadcast along axis 1: [[ 1 11 2] [ 3 11 4] [ 5 11 6]]
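Two details of insert that the example above does not show are sketched here with illustrative values: several insertion points can be given at once (they refer to positions in the original array), and inserted values are converted to the dtype of the input array.

```python
import numpy as np

a = np.array([1, 2, 3, 4])

# Insert 10 before original index 1 and 20 before original index 3.
r = np.insert(a, [1, 3], [10, 20])
print(r)         # [ 1 10  2  3 20  4]

# A float inserted into an integer array is cast to integer.
f = np.insert(a, 1, 7.13)
print(f)         # [1 7 2 3 4]
print(f.dtype == a.dtype)   # True
```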
numpy.delete¶
- This function returns a new array with the specified subarray deleted from the input array. As in case of insert() function, if the axis parameter is not used, the input array is flattened.
numpy.delete(arr, obj, axis)
| Parameter | Description |
|---|---|
| arr | Input array |
| obj | Can be a slice, an integer or array of integers, indicating the subarray to be deleted from the input array |
| axis | The axis along which to delete the given subarray. If not given, arr is flattened |
a = np.arange(12).reshape(3,4)
print('First array:')
print(a)
print('\n')
print('Array flattened before delete operation as axis not used:')
print(np.delete(a,5))
print('\n')
print('Column 2 deleted:')
print(np.delete(a,1,axis = 1))
print('\n')
print('A slice containing alternate values from array deleted:')
a = np.array([1,2,3,4,5,6,7,8,9,10])
print(np.delete(a, np.s_[::2]))
First array: [[ 0 1 2 3] [ 4 5 6 7] [ 8 9 10 11]] Array flattened before delete operation as axis not used: [ 0 1 2 3 4 6 7 8 9 10 11] Column 2 deleted: [[ 0 2 3] [ 4 6 7] [ 8 10 11]] A slice containing alternate values from array deleted: [ 2 4 6 8 10]
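The obj parameter also accepts a list of indices. The sketch below deletes two rows at once and shows an equivalent boolean-mask selection for comparison; the mask approach is an alternative, not something delete requires.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Delete rows 0 and 2 in one call; a itself is not modified.
kept = np.delete(a, [0, 2], axis=0)
print(kept)      # [[4 5 6 7]]
print(a.shape)   # (3, 4)

# The same rows can be selected with a boolean mask.
mask = np.ones(a.shape[0], dtype=bool)
mask[[0, 2]] = False
print(np.array_equal(kept, a[mask]))   # True
```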
numpy.unique¶
- This function returns an array of the unique elements in the input array. It can optionally return a tuple containing the array of unique values together with arrays of associated indices or counts. The nature of these extra arrays depends on which return parameters are set in the function call.
numpy.unique(arr, return_index, return_inverse, return_counts)
| Parameter | Description |
|---|---|
| arr | The input array. Will be flattened if not 1-D array |
| return_index | If True, returns the indices of the first occurrences of the unique elements in the input array |
| return_inverse | If True, returns the indices of unique array, which can be used to reconstruct the input array |
| return_counts | If True, returns the number of times the element in unique array appears in the original array |
a = np.array([5,2,6,2,7,5,6,8,2,9])
print('Unique values of first array:')
u = np.unique(a)
print(u)
print('\n')
print('Unique array and Indices array:')
u,indices = np.unique(a, return_index = True)
print(indices)
print('\n')
print('We can see each number corresponds to index in original array:')
print(a)
print('\n')
print('Indices of unique array:')
u,indices = np.unique(a,return_inverse = True)
print(u)
print('\n')
print('Indices are:')
print(indices)
print('\n')
print('Reconstruct the original array using indices:')
print(u[indices])
print('\n')
print('Return the count of repetitions of unique elements:')
u,counts = np.unique(a,return_counts = True)
print(u)
print(counts)
Unique values of first array: [2 5 6 7 8 9] Unique array and Indices array: [1 0 2 4 7 9] We can see each number corresponds to index in original array: [5 2 6 2 7 5 6 8 2 9] Indices of unique array: [2 5 6 7 8 9] Indices are: [1 0 2 0 3 1 2 4 0 5] Reconstruct the original array using indices: [5 2 6 2 7 5 6 8 2 9] Return the count of repetitions of unique elements: [2 5 6 7 8 9] [3 2 2 1 1 1]
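All three optional outputs can be requested in a single call. A minimal sketch on the same input array, with checks on how each output relates to the input (the variable names are illustrative):

```python
import numpy as np

a = np.array([5, 2, 6, 2, 7, 5, 6, 8, 2, 9])

u, first, inverse, counts = np.unique(
    a, return_index=True, return_inverse=True, return_counts=True)

# first holds the position of the first occurrence of each unique value.
print(np.array_equal(a[first], u))     # True

# inverse maps every input element to its position in u.
print(np.array_equal(u[inverse], a))   # True

# counts adds up to the number of input elements.
print(counts.sum() == a.size)          # True
```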
NumPy - Binary Operators¶
| Operation | Description |
|---|---|
| bitwise_and | Computes bitwise AND operation of array elements |
| bitwise_or | Computes bitwise OR operation of array elements |
| invert | Computes bitwise NOT |
| left_shift | Shifts bits of a binary representation to the left |
| right_shift | Shifts bits of binary representation to the right |
NumPy - bitwise_and¶
- The bitwise AND operation on the corresponding bits of binary representations of integers in input arrays is computed by np.bitwise_and() function.
print('Binary equivalents of 13 and 17:')
a,b = 13,17
print(bin(a), bin(b))
print('\n')
print('Bitwise AND of 13 and 17:')
print(np.bitwise_and(13, 17))
Binary equivalents of 13 and 17: 0b1101 0b10001 Bitwise AND of 13 and 17: 1
NumPy - bitwise_or¶
- The bitwise OR operation on the corresponding bits of binary representations of integers in input arrays is computed by np.bitwise_or() function.
a,b = 13,17
print('Binary equivalents of 13 and 17:')
print(bin(a), bin(b))
print('Bitwise OR of 13 and 17:')
print(np.bitwise_or(13, 17))
Binary equivalents of 13 and 17: 0b1101 0b10001 Bitwise OR of 13 and 17: 29
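The two examples above use scalars. The same functions work element-wise on arrays, and for integer arrays the & and | operators give the same results. A small sketch with illustrative values:

```python
import numpy as np

x = np.array([13, 12, 255])
y = np.array([17, 10, 15])

print(np.bitwise_and(x, y))   # [ 1  8 15]
print(np.bitwise_or(x, y))    # [ 29  14 255]

# The operators are shorthand for the same element-wise operations.
print(np.array_equal(x & y, np.bitwise_and(x, y)))   # True
print(np.array_equal(x | y, np.bitwise_or(x, y)))    # True
```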
numpy.invert()¶
This function computes the bitwise NOT result on integers in the input array. For signed integers, two's complement is returned.
Note that np.binary_repr() function returns the binary representation of the decimal number in the given width.
print('Invert of 13 where dtype of ndarray is uint8:')
print(np.invert(np.array([13], dtype = np.uint8)))
print('\n')
# Comparing binary representation of 13 and 242, we find the inversion of bits
print('Binary representation of 13:')
print(np.binary_repr(13, width = 8))
print('\n')
print('Binary representation of 242:')
print(np.binary_repr(242, width = 8))
Invert of 13 where dtype of ndarray is uint8: [242] Binary representation of 13: 00001101 Binary representation of 242: 11110010
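For a signed dtype the same bit pattern is read as a two's complement number, so inverting x gives -x - 1. A minimal sketch using int8 (the dtype is chosen only for illustration):

```python
import numpy as np

signed = np.array([13, 0, -1], dtype=np.int8)
inv = np.invert(signed)
print(inv)   # [-14  -1   0]

# In 8 bits, -14 has the same bit pattern as the unsigned value 242.
print(np.binary_repr(-14, width=8))   # 11110010
```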
NumPy - left_shift¶
- The numpy.left_shift() function shifts the bits in the binary representation of an array element to the left by the specified number of positions. An equal number of 0s is appended from the right.
print('Left shift of 10 by two positions:')
print(np.left_shift(10,2))
print('\n')
print('Binary representation of 10:')
print(np.binary_repr(10, width = 8))
print('\n')
print('Binary representation of 40:')
print(np.binary_repr(40, width = 8))
# The bits in '00001010' are shifted two positions to the left and two 0s are appended from the right.
Left shift of 10 by two positions: 40 Binary representation of 10: 00001010 Binary representation of 40: 00101000
NumPy - right_shift¶
- The numpy.right_shift() function shifts the bits in the binary representation of an array element to the right by the specified number of positions. For non-negative values, an equal number of 0s is appended from the left.
print('Right shift 40 by two positions:')
print(np.right_shift(40,2))
print('\n')
print('Binary representation of 40:')
print(np.binary_repr(40, width = 8))
print('\n')
print('Binary representation of 10')
print(np.binary_repr(10, width = 8))
# The bits in '00101000' are shifted two positions to the right and two 0s are appended from the left.
Right shift 40 by two positions: 10 Binary representation of 40: 00101000 Binary representation of 10 00001010
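For non-negative integers that stay within the range of the dtype, shifting left by n is the same as multiplying by 2**n, and shifting right by n is the same as floor division by 2**n. A small sketch on arrays with illustrative values:

```python
import numpy as np

vals = np.array([1, 2, 3, 10])
left = np.left_shift(vals, 2)
print(left)    # [ 4  8 12 40]
print(np.array_equal(left, vals * 4))    # True

more = np.array([40, 41, 7])
right = np.right_shift(more, 2)
print(right)   # [10 10  1]
print(np.array_equal(right, more // 4))  # True
```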
NumPy - String Functions¶
- These functions are defined in the numpy.char module. The older Numarray package contained the chararray class. The following functions in numpy.char are useful for performing vectorized string operations.
| Function | Description |
|---|---|
| add() | Returns element-wise string concatenation for two arrays of str or Unicode |
| multiply() | Returns the string with multiple concatenation, element-wise |
| center() | Returns a copy of the given string with elements centered in a string of specified length |
| capitalize() | Returns a copy of the string with only the first character capitalized |
| title() | Returns the element-wise title cased version of the string or unicode |
| lower() | Returns an array with the elements converted to lowercase |
| upper() | Returns an array with the elements converted to uppercase |
| split() | Returns a list of the words in the string, using the given separator as the delimiter |
| splitlines() | Returns a list of the lines in the element, breaking at the line boundaries |
| strip() | Returns a copy with the leading and trailing characters removed |
| join() | Returns a string which is the concatenation of the strings in the sequence |
| replace() | Returns a copy of the string with all occurrences of substring replaced by the new string |
| decode() | Calls bytes.decode element-wise |
| encode() | Calls str.encode element-wise |
numpy.char.add()¶
print('Concatenate two strings:')
print(np.char.add(['hello'],[' xyz']))
print('\n')
print('Concatenation example:')
print(np.char.add(['hello', 'hi'],[' abc', ' xyz']))
Concatenate two strings: ['hello xyz'] Concatenation example: ['hello abc' 'hi xyz']
numpy.char.multiply()¶
# This function performs multiple concatenation.
print(np.char.multiply('Hello ',3))
Hello Hello Hello
numpy.char.center()¶
#This function returns an array of the required width so that the input string is centered and padded on the left and right with fillchar.
#np.char.center(arr, width,fillchar)
print(np.char.center('hello', 20,fillchar = '*'))
*******hello********
numpy.char.capitalize()¶
# This function returns the copy of the string with the first letter capitalized.
print(np.char.capitalize('hello world'))
Hello world
numpy.char.title()¶
#This function returns a title cased version of the input string with the first letter of each word capitalized.
print(np.char.title('hello how are you?'))
Hello How Are You?
numpy.char.lower()¶
# This function returns an array with elements converted to lowercase. It calls str.lower for each element.
print(np.char.lower(['HELLO','WORLD']))
print(np.char.lower('HELLO'))
['hello' 'world'] hello
numpy.char.upper()¶
#This function calls str.upper function on each element in an array to return the uppercase array elements.
print(np.char.upper('hello'))
print(np.char.upper(['hello','world']))
HELLO ['HELLO' 'WORLD']
numpy.char.split()¶
# This function returns a list of words in the input string. By default, whitespace is used as the separator. Otherwise the specified separator character is used to split the string.
print(np.char.split ('hello how are you?'))
print(np.char.split ('TutorialsPoint,Hyderabad,Telangana', sep = ','))
['hello', 'how', 'are', 'you?'] ['TutorialsPoint', 'Hyderabad', 'Telangana']
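When the input is an array of strings, the result is an object array holding one list per element. A minimal sketch with illustrative strings:

```python
import numpy as np

phrases = np.array(['a b', 'c d e'])
parts = np.char.split(phrases)

# One Python list per input string, stored in an object array.
print(parts.dtype == object)   # True
print(parts[0])                # ['a', 'b']
print(parts[1])                # ['c', 'd', 'e']
```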
numpy.char.splitlines()¶
# This function returns a list of the lines in each element, breaking at line boundaries.
print(np.char.splitlines('hello\nhow are you?'))
print(np.char.splitlines('hello\rhow are you?'))
# '\n', '\r', '\r\n' can be used as line boundaries.
['hello', 'how are you?'] ['hello', 'how are you?']
numpy.char.strip()¶
# This function returns a copy of array with elements stripped of the specified characters leading and/or trailing in it.
print(np.char.strip('ashok arora','a'))
print(np.char.strip(['arora','admin','java'],'a'))
shok aror ['ror' 'dmin' 'jav']
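Because each function returns an array, the calls can be chained to clean up a whole array of strings in one expression. A small sketch with made-up input values; with no characters given, strip removes leading and trailing whitespace.

```python
import numpy as np

names = np.array(['  alice ', 'BOB  ', ' carol'])

# Strip surrounding whitespace, then title-case every element.
cleaned = np.char.title(np.char.strip(names))
print(cleaned)   # ['Alice' 'Bob' 'Carol']
```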
numpy.char.join()¶
# This function returns a string in which the individual characters are joined by the specified separator.
print(np.char.join(':','dmy'))
print(np.char.join([':','-'],['dmy','ymd']))
d:m:y ['d:m:y' 'y-m-d']
numpy.char.replace()¶
# This function returns a new copy of the input string in which all occurrences of a given substring are replaced by another given sequence.
print(np.char.replace ('He is a good boy', 'is', 'was'))
He was a good boy
numpy.char.decode()¶
# This function calls bytes.decode element-wise, decoding the given bytes using the specified codec.
a = np.char.encode('hello', 'cp500')
print(a)
print(np.char.decode(a,'cp500'))
b'\x88\x85\x93\x93\x96' hello
numpy.char.encode()¶
# This function calls str.encode for each element in the array. The default encoding is utf-8; any codec available in the standard Python library may be used.
a = np.char.encode('hello', 'cp500')
print(a)
b'\x88\x85\x93\x93\x96'
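Encoding and decoding are inverse operations when the same codec is used for both. A minimal round-trip sketch on an array; the words and the choice of UTF-8 are illustrative.

```python
import numpy as np

words = np.array(['héllo', 'world'])

# Encoding produces an array of bytes objects (dtype kind 'S').
encoded = np.char.encode(words, 'utf-8')
print(encoded.dtype.kind)   # S

# Decoding with the same codec recovers the original strings.
decoded = np.char.decode(encoded, 'utf-8')
print((decoded == words).all())   # True
```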
NumPy - Byte Swapping¶
- We have seen that the way data is stored in the memory of a computer depends on the architecture the CPU uses. It may be little-endian (least significant byte stored at the smallest address) or big-endian (most significant byte stored at the smallest address).
numpy.ndarray.byteswap()¶
- The numpy.ndarray.byteswap() function toggles between the two representations: big-endian and little-endian.
a = np.array([1, 256, 8755], dtype = np.int16)
print('Representation of data in memory in hexadecimal form:')
print(list(map(hex, a)))
# passing True makes byteswap() swap the bytes in place
print('Applying byteswap() function:')
print(a.byteswap(True))
print('In hexadecimal form:')
print(list(map(hex, a)))
# We can see the bytes being swapped
Representation of data in memory in hexadecimal form: ['0x1', '0x100', '0x2233'] Applying byteswap() function: [ 256 1 13090] In hexadecimal form: ['0x100', '0x1', '0x3322']
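To see the raw bytes themselves, the sketch below fixes the byte order explicitly with the dtype strings '<i2' (little-endian int16) and '>i2' (big-endian int16), so the result does not depend on the machine. It contrasts byteswap(), which reorders the bytes and therefore changes the values read back, with astype(), which changes the storage order but keeps the values.

```python
import numpy as np

a = np.array([1, 256], dtype='<i2')
print(a.tobytes())         # b'\x01\x00\x00\x01'

# byteswap() returns a new array with the bytes of each element reversed;
# the dtype is still little-endian, so the values change.
swapped = a.byteswap()
print(swapped.tobytes())   # b'\x00\x01\x01\x00'
print(swapped.tolist())    # [256, 1]

# Converting to a big-endian dtype stores the same bytes as above
# but the values stay the same.
big = a.astype('>i2')
print(big.tobytes())       # b'\x00\x01\x01\x00'
print(big.tolist())        # [1, 256]
```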
