Getting Started With Numpy


You all know that Python is a great, easy-to-learn programming language. With the help of a few popular libraries (numpy, scipy, matplotlib), it also becomes a powerful environment for scientific computing.

In this tutorial, we will focus on the basics of the numpy package.

Note – Before reading this tutorial, I strongly recommend having a basic knowledge of Python.

 

1. What is numpy?

Numpy is a Python library that provides a multidimensional array object, various derived objects, and a collection of routines for fast operations on arrays, including mathematical and logical operations, shape manipulation, sorting, reshaping, accessing, and more. The name comes from Numerical + Python = Numpy.

 

2. Differences between NumPy arrays and Python List

 

3. Is NumPy Fast?

The reasons why numpy is fast are given below:

  • Fewer bytes of memory to read, so reads are faster
  • No type checking when iterating over elements
 

Vectorization

Vectorized operations are optimized in a backend written in C. Their advantages are:

  1. Vectorized code is more concise and easier to read, and it more closely resembles standard mathematical notation.
  2. Vectorization results in more “Pythonic” code. Without vectorization, our code would be littered with inefficient and difficult-to-read for loops.
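As a small illustration of the points above, here is a sketch that computes y = w*x + b once with a for loop and once in vectorized form. The values of `x`, `w`, and `b` are made up for the example.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])
w, b = 2.0, 1.0  # hypothetical coefficients, chosen only for illustration

# Loop version: one Python-level operation per element.
y_loop = []
for value in x:
    y_loop.append(w * value + b)

# Vectorized version: reads like the formula y = w*x + b.
y_vec = w * x + b

print(y_vec)  # [3. 5. 7.]
```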

Broadcasting

Broadcasting is the term used to describe the implicit element-by-element behavior of operations. Generally speaking, in numpy all operations (not just arithmetic operations, but also logical, bit-wise, functional, etc.) behave in this implicit element-by-element fashion, i.e., they broadcast.
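A minimal sketch of broadcasting, using small arrays invented for the example: a scalar is applied to every element, and arrays of compatible shapes are stretched to a common shape.

```python
import numpy as np

row = np.array([1, 2, 3])        # shape (3,)
col = np.array([[10], [20]])     # shape (2, 1)

# The scalar 5 is broadcast across every element of row.
plus_five = row + 5
print(plus_five)                 # [6 7 8]

# Shapes (2, 1) and (3,) are broadcast to the common shape (2, 3).
grid = col + row
print(grid)                      # [[11 12 13]
                                 #  [21 22 23]]

# Logical operations broadcast in the same way as arithmetic ones.
mask = row > 1
print(mask)                      # [False  True  True]
```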

As we know, computers work with binary representations, so let us see how a number is stored in both Python and numpy.

The above diagram shows:

Consider the number 5, which is to be stored. A computer stores it in bytes of 8 bits each, so 5 becomes 00000101. Now comes the difference. In numpy, values are stored in contiguous memory locations as fixed-size numbers of 16, 32, or 64 bits, and we can choose the size. A Python list, on the other hand, holds integer objects that each store four parts: the size, the reference count, the object type, and the value. Handling this extra information for every element is why a list takes more time than a numpy array.
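The size difference can be inspected directly. The sketch below assumes CPython, where `sys.getsizeof` reports the size of the whole integer object. The exact figure it prints depends on the interpreter and platform, so no value is shown for it.

```python
import sys
import numpy as np

# The same value stored with three different fixed widths.
arr16 = np.array([5], dtype=np.int16)
arr32 = np.array([5], dtype=np.int32)
arr64 = np.array([5], dtype=np.int64)
print(arr16.itemsize, arr32.itemsize, arr64.itemsize)  # 2 4 8 (bytes per element)

# A Python int is a full object, so it needs more bytes than the raw value.
py_int_size = sys.getsizeof(5)
print(py_int_size)
```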

4. Advantages

The above diagram shows that the numbers in a list are stored in non-contiguous memory locations, while in np arrays they are stored in contiguous memory, which makes access more efficient and faster.

5. Why numpy?

  • Powerful n-dimensional arrays – Fast and versatile; the vectorization, indexing, and broadcasting concepts form the backbone of array computing today.
  • Numerical computing tools – It offers functions like random number generators, linear algebra routines, Fourier transforms, and more.
  • Interoperable – It supports a wide range of hardware and computing platforms, and works well with distributed, GPU, and sparse array libraries.
  • Performance – The core of numpy is well-optimized C code, so it is very fast.
  • Easy to use – High-level syntax makes it accessible and productive for programmers from any background or experience level.
  • Open source – NumPy is developed and maintained publicly on GitHub by a vibrant, responsive, and diverse community.

6. Need for numpy

Let us consider a list, list1 = [1,2,3], and suppose you need to add 5 to every element in it. With a list, you have to loop through it and add 5 to each element, which is a time-consuming process. To do this faster, we use numpy arrays.

1000
1000
Time taken in python list 52.18982696533203
Time taken in numpy list 187.0870590209961
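The idea can be sketched as follows. This is an illustrative timing harness, not the code that produced the figures above: the size of 1000 elements, the use of `timeit`, and the function names are all assumptions, and the timings it prints will differ from machine to machine.

```python
import timeit
import numpy as np

n = 1000  # assumed number of elements
py_list = list(range(n))
np_arr = np.arange(n)

def add_five_list():
    # Python-level loop over every element.
    return [x + 5 for x in py_list]

def add_five_array():
    # One vectorized operation on the whole array.
    return np_arr + 5

# Run each version 1000 times and report the total time in seconds.
list_time = timeit.timeit(add_five_list, number=1000)
array_time = timeit.timeit(add_five_array, number=1000)
print("Time taken with python list:", list_time)
print("Time taken with numpy array:", array_time)
```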


7. Applications

  • Mathematics
  • Plotting in matplotlib
  • Backend for libraries such as pandas
  • Base for ML
  • Broadcasting helps make deep learning computations faster

Installation

The best way to install this library on your system is by using a pre-built package for your operating system. You can install the package by running – pip install numpy

You can import the installed package by writing – import numpy as np

Now let’s start coding and explore the basics

8. Array – basics

In this topic we are going to learn the basics of numpy arrays

Initialize an integer array – 1d

A one-dimensional array can be defined as follows. Note that the elements should be given as a list and should be of a homogeneous type.
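A minimal sketch, assuming the array is named `a`:

```python
import numpy as np

# Build a 1-D integer array from a Python list.
a = np.array([1, 2, 3])
print(a)
```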

[1 2 3]

Initialize a float array – 2d

A two-dimensional array can be created as follows, by passing a list of lists with the inner lists separated by commas.
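A minimal sketch, assuming the array is named `b` and that writing the values as floats is what gives the float type:

```python
import numpy as np

# Each inner list becomes one row of the 2-D array.
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
print(b)
```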

[[1. 2. 3.]
 [4. 5. 6.]]


Get dimensions of the array

To get the number of dimensions of an array, we can use the ndim attribute.
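For example, with the 1-D and 2-D arrays from earlier (the names `a` and `b` are assumed):

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(a.ndim)  # number of dimensions of the 1-D array
print(b.ndim)  # number of dimensions of the 2-D array
```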

1
2

Get shape of an array

The shape attribute returns the shape of the array as a tuple. For example, (3,4) denotes that the array has 3 rows and 4 columns.
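A short sketch on the same two assumed arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(a.shape)  # one axis with 3 elements
print(b.shape)  # 2 rows and 3 columns
```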

(3,)
(2, 3)

Get the data type of the array

To get the data type of the elements in the array, use the dtype attribute.
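A short sketch on the same assumed arrays. The default integer type depends on the platform: it is int64 on most 64-bit systems, but it can be int32 elsewhere.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(a.dtype)  # platform default integer type
print(b.dtype)  # float64
```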

int64
float64


To specify the data type while initializing

We can specify the data type of the elements while creating the array, which is useful when we are concerned about memory.
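A minimal sketch, assuming the same three values as before and a 16-bit integer type:

```python
import numpy as np

# Request 16-bit integers instead of the platform default.
a = np.array([1, 2, 3], dtype='int16')
print(a.dtype)
```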

int16

Get size of array

The size attribute gives the total number of elements (the count) in the array.
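A short sketch with the assumed arrays `a` (int16) and `b` (float64):

```python
import numpy as np

a = np.array([1, 2, 3], dtype='int16')
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(a.size)  # 3 elements
print(b.size)  # 2 * 3 = 6 elements
```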

3
6

Get item size of the array

The itemsize attribute gives the size of one array element in bytes.
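A short sketch with the same assumed arrays:

```python
import numpy as np

a = np.array([1, 2, 3], dtype='int16')
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

print(a.itemsize)  # int16 uses 2 bytes per element
print(b.itemsize)  # float64 uses 8 bytes per element
```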

2
8

Total size = nbytes

The total size in bytes is the number of items multiplied by the size of one item, which is also available directly as nbytes.
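Both ways of computing the total size can be checked side by side. This sketch uses the same assumed arrays:

```python
import numpy as np

a = np.array([1, 2, 3], dtype='int16')
b = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

# Number of items multiplied by the size of one item.
print(a.size * a.itemsize)  # 3 * 2 = 6
print(b.size * b.itemsize)  # 6 * 8 = 48

# The same totals, read directly from nbytes.
print(a.nbytes)
print(b.nbytes)
```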

6
48
6
48

Complex arrays

Numpy supports complex number arrays too.
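A minimal sketch, assuming a 2x2 array named `c` created with a complex dtype:

```python
import numpy as np

# Real inputs are stored as complex numbers with a zero imaginary part.
c = np.array([[1, 2], [3, 4]], dtype=complex)
print(c)
```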

[[1.+0.j 2.+0.j]
 [3.+0.j 4.+0.j]]

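The attributes covered above and their uses are summarised below:

  • ndim – number of dimensions of the array
  • shape – shape of the array (rows, columns)
  • dtype – data type of the elements
  • size – total number of elements
  • itemsize – size of one element in bytes
  • nbytes – total size of the array in bytes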


9. Accessing specific element in an array

In this topic we will concentrate on how to access an element and modify it.

To get a specific element

To get a specific element of the array, we can access it by its row and column index, both of which start from 0. We can use negative indices to count from the end.
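A minimal sketch of element access. The 2x4 array here is hypothetical, with values chosen only to be consistent with the outputs shown in this section.

```python
import numpy as np

# Hypothetical array used for illustration.
a = np.array([[1, 3, 5, 7], [8, 10, 12, 14]])

# Row 1, column 2 (both indices start from 0).
print(a[1, 2])    # 12

# The same element, counted from the end with negative indices.
print(a[-1, -2])  # 12
```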

12


To get a specific row

To get a specific row, give the row index and select all column values (0 denotes the first row).

array([1, 3, 5, 7])

To get a specific column

array([1, 8])

The general syntax for slicing elements from an array is [startindex:endindex:stepsize]
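The row, column, and slice lookups can be sketched as follows, again on a hypothetical 2x4 array whose values are chosen to be consistent with the outputs above.

```python
import numpy as np

# Hypothetical array used for illustration.
a = np.array([[1, 3, 5, 7], [8, 10, 12, 14]])

first_row = a[0, :]        # row 0, all columns
first_col = a[:, 0]        # all rows, column 0
every_other = a[0, 0:4:2]  # row 0, from index 0 up to (not including) 4, in steps of 2

print(first_row)    # [1 3 5 7]
print(first_col)    # [1 8]
print(every_other)  # [1 5]
```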

9. Changing specific element in an array

You can locate the index and assign a new value. When assigning several values at once, note that their shape must match the shape of the selected region.

When the wrong shape is given, numpy raises an error.
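A minimal sketch of both cases, using a hypothetical 2x4 array. The final assignment is deliberately wrong, so the `ValueError` is caught and printed.

```python
import numpy as np

# Hypothetical array used for illustration.
a = np.array([[1, 3, 5, 7], [8, 10, 12, 14]])

# Change a single element.
a[1, 2] = 20

# Change a whole column: two rows, so two values are needed.
a[:, 0] = [5, 6]
print(a)  # [[ 5  3  5  7]
          #  [ 6 10 20 14]]

# Three values cannot fit into a column of two rows.
error_message = None
try:
    a[:, 0] = [1, 2, 3]
except ValueError as err:
    error_message = str(err)
    print("ValueError:", error_message)
```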
 
 

10. Initializing Different Types of Arrays

Common parameters:

  • shape – int or tuple of ints
  • dtype – data type, optional
  • order – {'C', 'F'}, optional, default: 'C' (row major or column major)

| Syntax | Example | Use |
| --- | --- | --- |
| numpy.zeros(shape, dtype=float, order='C') | np.zeros((2,2)) | To initialize all array elements as 0 |
| numpy.ones(shape, dtype=None, order='C') | np.ones((2,2,2), dtype='int32') | To initialize all array elements as 1 |
| numpy.full(shape, fill_value, dtype=None, order='C') | np.full((2,2), 8) | To initialize all array elements with our own value |
| numpy.full_like(a, fill_value, dtype=None, order='K', subok=True, shape=None) | np.full_like(array, 4) | Return a full array with the same shape and type as a given array |
| numpy.empty(shape, dtype=float, order='C') | np.empty((2,3)) | Return a new array of given shape and type, without initializing entries |
| numpy.random.rand(d0, d1, d2, …, dn) | np.random.rand(3,4) | Return an array of random floats |
| numpy.random.randint(low, high=None, size=None, dtype=int) | np.random.randint(2,8, size=(3,3)) | Return random integers from low (inclusive) to high (exclusive) |
| numpy.identity(n, dtype=None) | np.identity(3) | Return the identity matrix |
| numpy.repeat(a, repeats, axis=None) | | Repeat elements of an array |
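The calls listed above can be run together as in the sketch below. The `template` array passed to `np.full_like` and the `arr` array passed to `np.repeat` are assumptions chosen for illustration. The random functions return different values on every run, and `np.empty` returns whatever happens to be in memory.

```python
import numpy as np

zeros = np.zeros((2, 2))                          # all elements 0.0
ones = np.ones((2, 2, 2), dtype='int32')          # all elements 1, as 32-bit ints
eights = np.full((2, 2), 8)                       # all elements 8

template = np.array([[1, 3, 5, 7], [8, 10, 12, 14]])  # hypothetical 2x4 array
fours = np.full_like(template, 4)                 # same shape and type as template

uninit = np.empty((2, 3))                         # contents are arbitrary
rand_floats = np.random.rand(3, 4)                # floats in [0, 1)
rand_ints = np.random.randint(2, 8, size=(3, 3))  # ints from 2 (inclusive) to 8 (exclusive)
eye = np.identity(3)                              # 3x3 identity matrix

arr = np.array([[1, 2, 3, 4, 5, 6]])              # hypothetical input for repeat
rep_rows = np.repeat(arr, 3, axis=0)              # repeat the row 3 times
rep_cols = np.repeat(arr, 3, axis=1)              # repeat each element 3 times

for result in (zeros, ones, eights, fours, eye, rep_rows, rep_cols):
    print(result)
```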
 
[[0. 0.]
 [0. 0.]]
[[[1 1]
  [1 1]]

 [[1 1]
  [1 1]]]
[[8 8]
 [8 8]]
[[4 4 4 4]
 [4 4 4 4]]
[[1. 2. 3.]
 [4. 5. 6.]]
[[0.20060323 0.53350714 0.04559776 0.46512966]
 [0.40680087 0.80845959 0.28626885 0.62462556]
 [0.06047462 0.58131828 0.01010794 0.60903586]]

[[4 6 5]
 [5 4 2]
 [5 5 5]]
[[1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]]
[[1 2 3 4 5 6]
 [1 2 3 4 5 6]
 [1 2 3 4 5 6]]
[[1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6]]
 

11. Copy of array elements

There are basically two kinds of copy in numpy:

  1.  Shallow copy ( View )
  2.  Deep copy
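The difference between the two can be sketched as follows. The variable names are assumptions, and `view()` and `copy()` are used here as one common way to obtain each kind of copy.

```python
import numpy as np

# Shallow copy (view): shares the same data, so a change shows up in both.
shared = np.array([1, 2, 3])
view = shared.view()
view[0] = 5
print(shared)  # [5 2 3]
print(view)    # [5 2 3]

# Deep copy: owns its own data, so the source array is left untouched.
source = np.array([1, 2, 3])
deep = source.copy()
deep[0] = 5
print(deep)    # [5 2 3]
print(source)  # [1 2 3]
```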
[5 2 3]
[5 2 3]

[5 2 3]
[1 2 3]

12. Linear algebra operations

Matmul

When we need to multiply two matrices we can use matmul. The only condition is that the number of columns of the first matrix must be equal to the number of rows of the second matrix.
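A sketch using the two matrices printed in this section. Only matmul is named in the text. The other calls (`solve`, `inv`, `trace`, `norm`, `eigvals`) are inferred from the printed values and are an assumption about what was run.

```python
import numpy as np

a = np.array([[7, 4, 2], [3, 4, 7], [3, 2, 3]])
b = np.array([[2, 3, 3], [6, 2, 2], [7, 6, 6]])
print(a, b)

solution = np.linalg.solve(a, b)     # x such that a @ x == b
inverse = np.linalg.inv(a)           # inverse of a
trace = np.trace(a)                  # sum of the diagonal: 7 + 4 + 3
norm = np.linalg.norm(a)             # Frobenius norm: sqrt of the sum of squares
eigenvalues = np.linalg.eigvals(a)   # eigenvalues of a (may be complex)
product = np.matmul(a, b)            # matrix product, same as a @ b

for result in (solution, inverse, trace, norm, eigenvalues, product):
    print(result)
```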

[[7 4 2]
 [3 4 7]
 [3 2 3]] [[2 3 3]
 [6 2 2]
 [7 6 6]]
[[ 4.          4.45454545  4.45454545]
 [-8.5        -8.72727273 -8.72727273]
 [ 4.          3.36363636  3.36363636]]
[[-0.09090909 -0.36363636  0.90909091]
 [ 0.54545455  0.68181818 -1.95454545]
 [-0.27272727 -0.09090909  0.72727273]]
14
12.84523257866513
[11.67802104+0.j          1.16098948+0.73210946j  1.16098948-0.73210946j]
[[52 41 41]
 [79 59 59]
 [39 31 31]]


13. Re arranging arrays

[[10 20 30 40]
 [50 60 70 80]]
[[10 20 30 40]
 [50 60 70 80]]
[10 20 30 40 50 60 70 80]
[[10 50]
 [20 60]
 [30 70]
 [40 80]]
[[10 20 30 40 10 20 30 40]
 [50 60 70 80 50 60 70 80]]
[[1 2 3 4]
 [5 6 7 8]
 [1 2 3 4]
 [5 6 7 8]]
[[1. 1. 1. 1. 0. 0.]
 [1. 1. 1. 1. 0. 0.]]
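The outputs above are consistent with reshaping, flattening, transposing, and stacking. Here is a minimal sketch, assuming those were the operations used; the exact calls and variable names are guesses.

```python
import numpy as np

a = np.array([[10, 20, 30, 40], [50, 60, 70, 80]])

reshaped = a.reshape(2, 4)        # same data arranged as 2 rows x 4 columns
flat = a.flatten()                # 1-D copy of all elements
transposed = a.T                  # rows become columns
side_by_side = np.hstack((a, a))  # join horizontally (more columns)

b = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
stacked = np.vstack((b, b))       # join vertically (more rows)

# Arrays of different widths can be joined horizontally if the row counts match.
mixed = np.hstack((np.ones((2, 4)), np.zeros((2, 2))))

for result in (a, reshaped, flat, transposed, side_by_side, stacked, mixed):
    print(result)
```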

14. Conclusion

This brings us to the end of the tutorial. We covered the basics of working with this library with examples, and we explored how to perform various operations with NumPy, which is widely used in data science applications.

You can also check our post on Top languages for Data Science


Aswath Rao

Currently pursuing Msc in Data Science
