Team - GeekPython

Type Hinting in Python

Sachin Pal — Mon, 22 Apr 2024 11:02:31 GMT

Python is a dynamically typed language, meaning you do not need to specify the type of variables, parameters or return values. Types are determined during program execution, based on the values assigned to a variable or passed as an argument.

Python 3.5 introduced type hints for function parameters and return values, and Python 3.6 extended the syntax to variables. Type hints let developers declare the expected data type of variables, parameters and return values without changing Python's dynamic typing.

What is a Type Hint?

x: int = 5
y: str = "5"

In this example, the variable x expects an integer value while the variable y expects a string value.

This is called a type hint: you specify the expected data type for a variable, parameter or return value of a function. Static type checkers and editors use these hints to catch mismatches before the program runs.

The syntax differs slightly for variables, parameters, return values and collections.

Consider the following example:

def circle(radius: float) -> str:
    area = 3.14 * radius ** 2
    return f"Area of circle: {area}"

In this example, the function circle accepts an argument radius which is expected to be a float value as indicated by type hint radius: float, and the return value of this function is expected to be a string, as indicated by the -> str hint.

Performing a Check

What if you pass the argument of a different type which is not expected? Consider the function below:

def date(dd: str, mm: str, yyyy: str) -> str:
    return f"Current Date: {dd}-{mm}-{yyyy}"

curr_date = date(22, 9, 2024)  # expected 'str' got 'int'
print(curr_date)

The function date() accepts three arguments and expects all of them to be strings. However, integer values are supplied when the function is called.

What do you think? Will this program throw an error? It looks like it should, but the Python interpreter ignores type hints at runtime because enforcing them is not its job.

Current Date: 22-9-2024

The types of the arguments are only determined at runtime, and the interpreter never compares them with the hints, which is why no error is raised. You may, however, see a warning in your IDE or code editor.
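As a small illustration of this behaviour, the sketch below reuses the date() function from above and inspects its __annotations__ attribute. It is only meant to show that the hints are stored as metadata on the function while the call itself proceeds unchecked.

```python
def date(dd: str, mm: str, yyyy: str) -> str:
    return f"Current Date: {dd}-{mm}-{yyyy}"

# The hints are stored on the function object as plain metadata...
hints = date.__annotations__
print(hints)

# ...but nothing compares them with the actual arguments at call time.
result = date(22, 9, 2024)
print(result)
```

Editors and static type checkers read the same metadata to produce their warnings.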

Annotating Multiple Return Types

In this section, you'll learn how to annotate multiple return types for a single value of alternative types and multiple values of different types.

Alternative Types for a Return Value

def cart(item: str) -> str | None:
    if item == "":
        return None
    return "Item added in the cart"

The function cart() accepts an argument item and returns None if item is an empty string; otherwise, it returns a string.

To annotate a return value that can be one of several types, use the pipe (|) operator, available for type hints since Python 3.10. Here, str | None means that the return value is either a str or None.

You can also use typing.Union to accomplish the same thing.

from typing import Union

def cart(item: str) -> Union[None, str]:
    if item == "":
        return None
    return "Item added in the cart"

Union[None, str] is equivalent to None | str. The pipe form is shorter and is the recommended spelling on Python 3.10 and later.

Multiple Return Values of Different Types

def cart(item: str, quantity: int) -> dict[str, int] | None:
    if item == "" or quantity == 0:
        return None
    return {item: quantity}

The function cart() has been updated to accept an additional argument quantity. It returns None if item is an empty string or quantity is 0; otherwise, it returns a dictionary containing a key (str) and value (int).

In the type hint for the return value, dict[str, int] states that the dictionary (dict) maps string keys (str) to integer values (int).

You can write the same annotation with the typing module.

from typing import Union, Dict

def cart(item: str, quantity: int) -> Union[Dict[str, int], None]:
    if item == "" or quantity == 0:
        return None
    return {item: quantity}

You can use Mapping instead of Dict to represent the dictionary.

from typing import Union, Mapping

def cart(item: str, quantity: int) -> Union[Mapping[str, int], None]:
    if item == "" or quantity == 0:
        return None
    return {item: quantity}

But all this is a bit verbose, so on Python 3.10 and later you can use the shorthand syntax shown first.

Type Hinting Functions

from collections.abc import Callable

def apply_cart(
        func: Callable[[str, int], dict[str, int] | None],
        item: str,
        quantity: int) -> dict[str, int] | None:
    return func(item, quantity)

def cart(item: str, quantity: int) -> dict[str, int] | None:
    if item == "" or quantity == 0:
        return None
    return {item: quantity}

The function apply_cart() accepts a callable object (which can be a function or any other callable object) and two arguments (item and quantity), and returns whatever the callable returns: a dictionary containing the key and value, or None.

The Callable type hint has two parts. The first is the list of argument types ([str, int]) that the callable object accepts; in this case, func() expects a string and an integer. The second is the return type (dict[str, int] | None), which must match the return type of cart(), otherwise a type checker would reject the call.

The second function, cart(), is identical to the previous example and returns a dictionary if everything is correct.

cart_item = apply_cart(cart, "Mouse", 2)
print(cart_item)
--------------------
{'Mouse': 2}

The apply_cart() function is invoked with the cart (a function) and two arguments ('Mouse', 2).

The apply_cart() function calls the cart() function with the given arguments and returns a result.

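Here's a way to generalise this: if the callable can take many arguments of different types, you can use an ellipsis (...) rather than listing every input type.

The ellipsis literal (...) indicates that the callable can accept an arbitrary list of arguments. The trade-off is that the type checker no longer verifies the arguments passed to the callable.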

from collections.abc import Callable
from typing import TypeVar, Any

T = TypeVar("T")

def apply_cart(
        func: Callable[..., T],
        *args: Any) -> T:
    return func(*args)

The apply_cart() function has been updated: func is now annotated as Callable[..., T], and the remaining arguments are collected by a variadic parameter (*args) of any type (Any).

The first parameter in the Callable is an ellipsis (...), indicating that any parameter list is permitted.

The second parameter is a type variable (T = TypeVar("T")) that stands for the callable's return type. Because apply_cart() is also annotated as returning T, a type checker infers that apply_cart() returns whatever type the passed callable returns.

You can also use a parameter specification variable (ParamSpec) instead of an ellipsis to make callable objects accept any number of positional or keyword arguments.

from collections.abc import Callable
from typing import TypeVar, ParamSpec

P = ParamSpec("P")
T = TypeVar("T")

def apply_cart(
        func: Callable[P, T],
        *args: P.args) -> T:
    return func(*args)

In this case, the first parameter of Callable is a parameter specification variable (P = ParamSpec("P")) that captures the callable's full parameter list. The second parameter (T) captures its return type.

The function apply_cart() also accepts a variadic argument (*args) of type P.args, which represents the tuple of positional arguments. To annotate **kwargs, P.kwargs must be used, which represents the mapping of keyword parameters to their values. Type checkers require P.args and P.kwargs to appear together in the same signature, so the version above is only an intermediate step.

Unlike the ellipsis, ParamSpec lets the type checker verify that the arguments passed to apply_cart() match the parameters of the callable.

The code is further updated by adding **kwargs, which allows the function to accept keyword arguments of arbitrary length.

from collections.abc import Callable
from typing import TypeVar, ParamSpec

P = ParamSpec("P")
T = TypeVar("T")

def apply_cart(
        func: Callable[P, T],
        *args: P.args,
        **kwargs: P.kwargs) -> T:
    return func(*args, **kwargs)

def cart(item: str, quantity: int) -> dict[str, int] | None:
    if item == "" or quantity == 0:
        return None
    return {item: quantity}

cart_item = apply_cart(cart, "Mouse", quantity=2)
print(cart_item)

Type Hinting Iterables

Consider the following function that takes a list of names and sorts them in ascending order.

def sort_student(names: list[str]) -> None:
    sorting = sorted(names)
    for name in sorting:
        print(name)

n = ["Gojo", "Yuta", "Yuji", "Megumi"]
sort_student(n)

The list[str] type is used to type hint the parameter names to indicate that it expects a list of strings.

But what if you want names to be a tuple or a set instead? You would need to change the annotation. Here the type appears in only one place, but in a large, complex project the same change can touch many functions.

You can use the type Iterable to make the function accept any iterable object.

from collections.abc import Iterable

def sort_student(names: Iterable[str]) -> None:
    sorting = sorted(names)
    for name in sorting:
        print(name)

n1 = ("Gojo", "Yuta", "Yuji", "Megumi") # No error
n2 = ["Gojo", "Yuta", "Yuji", "Megumi"] # No error
n3 = {"Gojo", "Yuta", "Yuji", "Megumi"} # No error
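To see why Iterable is a safe annotation here, note that sorted() accepts any iterable and always returns a new list. The sketch below uses a hypothetical variant, sorted_students(), that returns the sorted names instead of printing them so the result is easy to inspect.

```python
from collections.abc import Iterable

def sorted_students(names: Iterable[str]) -> list[str]:
    # sorted() works on any iterable and always returns a new list
    return sorted(names)

from_tuple = sorted_students(("Gojo", "Yuta", "Yuji", "Megumi"))
from_set = sorted_students({"Gojo", "Yuta", "Yuji", "Megumi"})
from_generator = sorted_students(name for name in ["Yuta", "Gojo"])

print(from_tuple)
print(from_set)
print(from_generator)
```

Accepting the broad Iterable type for input while returning the concrete list type keeps the function flexible for callers and predictable for whoever uses the result.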

Type Aliases for Better Readability

Complex type hints, such as those for callable objects in the section above or for function arguments and return values, can be cumbersome to write repeatedly.

Why not simplify a complex type by giving it an alias (a name)? You can create an alias for a type in much the same way you would create a variable.

type ReturnCart = dict[str, int] | None

def cart(item: str, quantity: int) -> ReturnCart:
    if item == "" or quantity == 0:
        return None
    return {item: quantity}

cart_item = apply_cart(cart, "Mouse", quantity=2)
print(cart_item)

In the above example, an alias ReturnCart is created and contains the type of return value (dict[str, int] | None).

The alias is defined with the soft keyword type, which creates an instance of TypeAliasType. This keyword was added in Python 3.12. If you are using an older version of Python, you can choose an alternate method.

Type aliases can also be created like you would create a variable using a simple assignment.

ReturnCart = dict[str, int] | None

To clarify, you can mark it with TypeAlias to explicitly show this as a type alias, not a simple variable.

from typing import TypeAlias

ReturnCart: TypeAlias = dict[str, int] | None

The methods described above will assist you in simplifying complex types, and the key benefit is that you will only need to edit the type in one spot, eliminating the need for refactoring.

Another advantage of using type aliases is that they can be reused in other portions of your code, which improves code readability.

Type Checker Tools

The Python interpreter does not enforce type hints; the types of variables, function arguments, and return values are determined at runtime.

Mypy, a popular third-party tool, performs static type checking for Python. Since it is a third-party tool, you must install it using the following command.

python -m pip install mypy

This will provide you access to the mypy command to type-check your program.

Static Type Checking Using Mypy

The function sort_student() takes an iterable object and sorts the student names in ascending order. Save this function to a file, such as student.py.

# student.py
from collections.abc import Iterable

def sort_student(names: Iterable[str]) -> None:
    sorting = sorted(names)
    for name in sorting:
        print(name)

n1 = ("Gojo", "Yuta", "Yuji", "Megumi")
sort_student(n1)

To type check your program using mypy, enter the command mypy followed by the name of the file you want to check in the terminal.

> mypy student.py
Success: no issues found in 1 source file

The type checker identified no errors. One thing you may have noticed is that the code contains the function call (sort_student(n1)), which should have created the output, but there was none in the terminal except the message created by mypy.

This means that mypy does not execute your code, instead, it examines if the values match the expected type based on the type hints.

What if you made a mistake and the actual values do not match the expected type?

# student.py
def sort_student(names: list[str]) -> None:
    sorting = sorted(names)
    for name in sorting:
        print(name)

n1 = ("Gojo", "Yuta", "Yuji", "Megumi")
sort_student(n1)

This function expects a list of strings, but a tuple was passed. If you now check your code, mypy will report the following error.

> mypy student.py
student.py:8: error: Argument 1 to "sort_student" has incompatible type "tuple[str, str, str, str]"; expected "list[str]"  [arg-type]
Found 1 error in 1 file (checked 1 source file)

Mypy has identified the error on line number 8 saying that the argument passed to sort_student() has an incompatible type tuple[str, str, str, str], it is expected to be a list[str].

Resources

Type cheat sheet

Conclusion

Type hints are optional in Python, but they make code more readable and, combined with a type checker, help you catch bugs before the code runs.

In this article, you've learned:

Implementing type hints
Alternative types for single data and different types for multiple values using pipe (|) and Union
Callable type for type hinting functions
Ellipsis (...) and TypeVar in Callable type to make a callable object accept an arbitrary list of arguments of any type
ParamSpec with TypeVar in Callable type, as an alternative to the ellipsis, so that the type checker can still verify the arguments passed to the callable
Iterable type for type hinting iterable objects
Type aliases to simplify complex type
Mypy for type checking in Python

There is much more you can do with type hints in Python.

That's all for now

Keep Coding

Best Practices: Positional and Keyword Arguments

Sachin Pal — Tue, 16 Apr 2024 04:30:56 GMT

A function or a method can accept both positional and keyword arguments, and it is you who decides whether a function/method will accept positional or keyword arguments.

You can choose whichever suits your function best, positional or keyword parameters, according to the requirement.

In this article, you'll learn the best practices for positional and keyword arguments.

Positional & Keyword Arguments

Positional arguments are passed positionally in the function/method call while the keyword or named arguments are passed by the parameter's name along with the corresponding values.

# Positional and Keyword Args
def func(p1, p2, p3):
    print(p1, p2, p3)

In this example, the function func() takes three arguments (p1, p2, and p3) and prints them. This function does not explicitly state how the parameters should be passed when you call it.

Consider the following function calls that are all valid.

# Option 1
func("Here", "we", "go")

# Option 2
func("Here", "we", p3="go")

# Option 3
func(p1="Here", p2="we", p3="go")

# Option 4
func("Here", p2="we", p3="go")
--------------------
Here we go
Here we go
Here we go
Here we go

You have the flexibility to pass arguments however you wish, as long as you follow the correct syntax.

Stick to the Rule

You need to stick to the rule when passing a combination of positional and keyword arguments.

The following function call will result in an error as it violates the syntax set by Python.

func("Here", p2="we", "go")

The p1 and p3 are passed positionally while the p2 is passed as a keyword argument.

...
    func("Here", p2="we", "go")
                              ^
SyntaxError: positional argument follows keyword argument

You can see that this program raised a SyntaxError stating that the positional argument cannot be passed after the keyword argument.

When you pass the keyword argument to a function, all subsequent arguments must also be passed as keywords. Positional arguments should always come before keyword arguments.
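Because this is a syntax error, Python rejects the call before any code runs. The sketch below demonstrates that by compiling the offending call from a string with the built-in compile() function; the func name inside the string is never looked up, since compilation fails first.

```python
source = 'func("Here", p2="we", "go")'

try:
    # Only parses and compiles the text; nothing is executed.
    compile(source, "<example>", "eval")
except SyntaxError as exc:
    message = exc.msg
else:
    message = "no error"

print(message)
```

This is why the error appears even if the faulty call sits in a branch of the program that would never be reached.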

Restrict Arguments to be Positional-only and Keyword-only

One of the best practices is to restrict arguments to be positional-only or keyword-only.

You can create a function that accepts only positional arguments, only keyword arguments, or a combination of the two.

Here's how you can achieve it.

# Takes Positional-only Args
def func1(p1, p2, p3, /):
    print(p1, p2, p3)

# Takes Keyword-only Args
def func2(*, kw1, kw2, kw3):
    print(kw1, kw2, kw3)

# Takes Mix Args
def func3(pos, /, pos_kw, *, kw):
    print(pos, pos_kw, kw)

In this example, the function func1() allows only positional arguments due to the slash (/) at the end of its parameter list. Likewise, the function func2() allows only keyword arguments because of the asterisk (*) at the beginning of its parameter list.

The function func3() accepts mixed arguments: pos is positional-only, pos_kw can be positional or keyword, and kw is keyword-only.

Read - Why Slash and Asterisk Used in Function Definition?

# Passed positional-only args
func1("positional", "only", "arguments")

# Passed keyword-only args
func2(kw1="keyword", kw2="only", kw3="arguments")

# Passed mixed args
func3("mixed", "arguments", kw="passed")
--------------------
positional only arguments
keyword only arguments
mixed arguments passed

This approach can be useful when developing an API and you don't want users to pass values through the parameter name. In that situation, you can configure the API to only accept positional values, and you can modify the parameter name in the future without breaking anything.

Also, if you're writing a function for e-commerce, you can utilise keyword-only parameters to avoid confusion between price, item, and so on.
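To show what the restriction looks like from the caller's side, the sketch below calls func1() and func2() the wrong way and collects the resulting TypeError messages. The functions return their arguments instead of printing them, which is a small change made only for this example.

```python
def func1(p1, p2, p3, /):
    return (p1, p2, p3)

def func2(*, kw1, kw2, kw3):
    return (kw1, kw2, kw3)

errors = []

try:
    # p3 is positional-only, so naming it is rejected
    func1("positional", "only", p3="arguments")
except TypeError as exc:
    errors.append(str(exc))

try:
    # func2() takes keyword-only parameters, so positional values are rejected
    func2("keyword", "only", "arguments")
except TypeError as exc:
    errors.append(str(exc))

for error in errors:
    print(error)
```

Unlike the SyntaxError for misordered arguments, these are runtime errors raised at the moment of the call.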

Optional Arguments/Arguments with Default Value

You can set a default value for an argument if you need to by assigning a value to the parameter in the function definition.

# Function with an optional param
def optional_param(name, age, height=170):
    print(f"Name  : {name}")
    print(f"Age   : {age}")
    print(f"Height: {height}")

The function optional_param() has a height parameter set to a default value of 170. If you don't pass it in a function call, Python will assign that default value. This means that it became an optional argument.

optional_param("John", 32)
--------------------
Name  : John
Age   : 32
Height: 170

You can override the default value by specifying a different value. This can be helpful when you want to set a minimum or a maximum limit.
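Overriding the default might look like the sketch below. The describe() function is a hypothetical variant of optional_param() that returns a string rather than printing, so the three results can be compared side by side.

```python
def describe(name, age, height=170):
    return f"{name}, {age}, {height} cm"

default_height = describe("John", 32)          # falls back to height=170
by_position = describe("John", 32, 180)        # overrides positionally
by_keyword = describe("John", 32, height=180)  # overrides by name

print(default_height)
print(by_position)
print(by_keyword)
```

Passing the override by keyword tends to read more clearly, because the call site shows which optional value is being changed.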

Variadic Arguments

When you're unsure how many arguments a function will receive, or don't want to specify a long list of parameters in the function definition, make it accept variadic arguments (any number of arguments) instead.

# Takes any no. of args
def send_invitation(*args):
    print("Invitation sent to:")
    for everyone in args:
        print(everyone)

You can pass any number of positional arguments to the function send_invitation() due to *args. This *args expression indicates that the function can accept any number of arguments and the values are stored in a tuple.

send_invitation(
    "John",
    "Max",
    "Cindy"
)
--------------------
Invitation sent to:
John
Max
Cindy

Likewise, you can make a function accept variadic keyword arguments using **kwargs.

# Takes any no. of keyword args
def party_items(**kwargs):
    print("Party items:")
    for key, value in kwargs.items():
        print(f"{key}: {value}")

party_items(
    Poppers=3,
    Ballons=120,
    Sparkles=10
)

The **kwargs stores keyword arguments as the key/value pair in a dictionary, so you can perform any related operations on the keys and values.
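Both forms can be combined in one signature. The sketch below uses a hypothetical plan_party() function to show where each kind of argument ends up: positional values in a tuple and keyword values in a dictionary.

```python
def plan_party(*guests, **items):
    # guests is a tuple of positional values, items is a dict of keyword values
    return {"guests": guests, "items": items}

plan = plan_party("John", "Max", Poppers=3, Sparkles=10)

print(type(plan["guests"]).__name__, plan["guests"])
print(type(plan["items"]).__name__, plan["items"])
```

When both are present, *args must come before **kwargs in the function definition.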

Read - Understanding *args and **kwargs in Python.

Argument Ordering

Positional arguments must be passed in the same order as the parameters appear in the function definition.

def details(name, age, occupation):
    print(f"Name      : {name}")
    print(f"Age       : {age}")
    print(f"Occupation: {occupation}")

details(32, "John", "Writer")

When you run this code, you'll get the name as 32 and the age as John due to a messed up argument order when the function is called.

Name      : 32
Age       : John
Occupation: Writer

But if you pass the values using the parameter name in any order, you'll get the desired result.

details(age=32, name="John", occupation="Writer")
--------------------
Name      : John
Age       : 32
Occupation: Writer

However, you cannot rely on this everywhere, because each function differs; for example, some functions accept only positional arguments.

So, the bottom line is: when passing arguments positionally, always pass them in the order the parameters appear in the function definition.

Argument Type Hinting

It's a good idea to specify which argument types are expected when the function is invoked. This tells the caller what type each argument should be.

# Type hinting
def func(arg1: int, arg2: float):
    print(f"{arg1} type: {type(arg1)}")
    print(f"{arg2} type: {type(arg2)}")

When you call the function func(), arg1 should be passed as an integer and arg2 should be passed as a floating-point number. This is called type hinting.

func(2, 3.9)
--------------------
2 type: <class 'int'>
3.9 type: <class 'float'>

You are not forced to pass the specified types. You can pass any type you wish, and Python won't throw an error because the interpreter does not enforce type hints.

func("2", 3.9)
--------------------
2 type: <class 'str'>
3.9 type: <class 'float'>

Because Python is dynamically typed, type hints are not enforced at runtime, but they are still helpful for avoiding bugs, especially when combined with an editor or a static type checker.

Naming the Arguments Properly

You must name your parameters in a way that defines their purpose. Let's say you are creating a program for a vehicle company, so you must name your parameters according to that for better readability.

Conclusion

Python provides a lot of freedom to developers. Each programming language is distinct and different, and following best practices elevates them even further.

A developer should understand how to get the most out of the programming language, including how to leverage their experience and best practices.

In this article, you've learned some of the best practices for positional and keyword arguments that are as follows:

Sticking to the proper syntax for passing positional and keyword arguments
Making the function accept only positional arguments, only keyword arguments, or a mix of both
Setting a default value for an argument
Using variadic arguments (*args and **kwargs) when you are unsure how many arguments a function should accept
Passing arguments in the correct order to avoid unpredictable behaviour
Using type hints to avoid bugs
Naming the parameters in a way that defines their purpose

Creating a MySQL Database in Python

Sachin Pal — Fri, 05 Apr 2024 04:30:27 GMT

Databases are crucial for storing and managing data. In this article, you'll learn to create and interact with a MySQL database in Python.

Installing PyMySQL

PyMySQL is a MySQL client library written in Python that allows you to create and interact with MySQL databases.

This is a third-party library, therefore you must install it on your system. To install it using pip, run the following line in your terminal.

pip install pymysql
--------------------- OR ---------------------
python -m pip install pymysql

Note: You must have a MySQL server installed in your system.

Creating MySQL Database

To begin, import the PyMySQL library into your project's environment to handle database operations.

# Importing the required lib
import pymysql

PyMySQL includes a connect() function that accepts the necessary arguments, such as host, username, password, database name, and so on, to establish a connection with the database server.

In this step, you will need access to your MySQL server's username and password.

# Initialize connection with server
mysql_db = pymysql.connect(
    host="localhost",
    user="root",
    password="********"
)

In the above code, the host is where your MySQL server is hosted; in this case, it is hosted on a local machine, therefore the value "localhost" is provided.

The user is your MySQL server's username, which is "root" by default, and the password is the one you specified when you first set up the server.
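Rather than writing the password directly in the script, one option is to read the connection details from environment variables. The sketch below is only an illustration: the connection_settings() helper and the variable names MYSQL_HOST, MYSQL_USER and MYSQL_PASSWORD are assumptions made for this example, not something PyMySQL defines.

```python
import os

def connection_settings(env=os.environ):
    """Build keyword arguments for connect() from environment variables.

    The MYSQL_* names are hypothetical; use whatever your setup provides.
    """
    return {
        "host": env.get("MYSQL_HOST", "localhost"),
        "user": env.get("MYSQL_USER", "root"),
        "password": env["MYSQL_PASSWORD"],  # raises KeyError if it is not set
    }

# A plain dict stands in for the real environment in this demonstration
settings = connection_settings({"MYSQL_PASSWORD": "example-only"})
print(settings["host"], settings["user"])
```

The resulting dictionary could then be unpacked into the call, as in pymysql.connect(**connection_settings()).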

To interact with the MySQL database, you must first create a cursor object for it using the cursor() function.

# Database cursor
cursor = mysql_db.cursor()

This step involves running a MySQL query to create a database on the MySQL server using the cursor (mysql_db.cursor()) object.

# SQL query to create database
cursor.execute("CREATE DATABASE IF NOT EXISTS pokemon_db")
cursor.execute("SHOW DATABASES")

The cursor.execute() method executes an SQL query. The first query says "Create a database named pokemon_db if it doesn't already exist" and the second query says "Show all the databases that reside on the server".

Finally, disconnect the database connection and cursor object with the close() function.

# Closing the database cursor and connection
cursor.close()
mysql_db.close()

When you run the code, nothing will appear on the console, but your database has been created on the server. You can check in the MySQL Workbench.

To display all of the databases on the server using Python, add the following code to the script after the SHOW DATABASES query and before the cursor and connection are closed.

# Displaying databases
for databases in cursor:
    print(databases)

Now, when you rerun the code, all the databases residing on the server are displayed on the console.

('books_db',)
('information_schema',)
('mysql',)
('performance_schema',)
('pokemon_db',)
('sys',)

You can see your newly created database (pokemon_db) is being displayed.

Interacting with Database

You may simply interact with this newly generated MySQL database by adding tables and columns and performing CRUD operations.

Creating a Database Table

You have established a MySQL database called pokemon_db. Now you must create a table with some fields to store data related to the fields.

Create a new file in your project directory and place the following code within it.

# Importing PyMySQL and cursors
import pymysql.cursors

# Initialize connection with database
mysql_db = pymysql.connect(
    host="localhost",
    user="root",
    password="********",
    database="pokemon_db",
    cursorclass=pymysql.cursors.DictCursor
)

# Database cursor
cursor = mysql_db.cursor()

# Function to create a table
def create_db_table():
    cursor.execute('''
                CREATE TABLE IF NOT EXISTS pokemon (
                    id INT AUTO_INCREMENT PRIMARY KEY,
                    name VARCHAR(500) NOT NULL UNIQUE,
                    cp INT(50) NOT NULL,
                    hp INT(50) NOT NULL
                )
            ''')
    mysql_db.commit()

if __name__ == "__main__":
    create_db_table()

cursor.close()

This time, the database name (pokemon_db) is supplied in the connect() function. This implies that a connection will be established to the pokemon_db database.

The cursorclass is set to cursors.DictCursor, a cursor that returns results in dictionary format.

The create_db_table() function creates a table named "pokemon" containing the following fields:

id: This field assigns a serial number for each entry made in the database automatically due to "AUTO_INCREMENT".
name: stores the Pokemon name.
cp: stores the combat power of the Pokemon.
hp: stores the hit points of the Pokemon.

The changes are saved to the database using mysql_db.commit(). After running the code, the table will be created with the specified fields.

Adding Data to the Database

...

# Adding data to the database
def add_entry():
    # SQL query
    query = '''
            INSERT INTO `pokemon` (`name`, `cp`, `hp`)
            VALUES (%s, %s, %s)
            '''
    # Adding three entries
    cursor.execute(query, ('Charizard', 120, 200))
    cursor.execute(query, ('Pikachu', 60, 100))
    cursor.execute(query, ('Squirtle', 78, 102))
    # Committing the changes
    mysql_db.commit()

if __name__ == "__main__":
    # create_db_table()
    add_entry()

cursor.close()

The add_entry() function is defined and added to the code from the previous section.

Inside the function, an SQL query is defined to insert data into the corresponding fields of the pokemon table. Next, the function executes the SQL query multiple times, each time with different values for the Pokemon's name, combat power (cp), and hit points (hp).

After adding the entries, the function commits the changes to the database using mysql_db.commit().

When you run the function, the data will be added to the database.
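When several rows share the same query, the cursor's executemany() method can send them in one call instead of repeating execute(). The sketch below is an illustration under stated assumptions: add_entries() is a hypothetical helper, and FakeCursor and FakeConnection are stand-ins written for this example so that it runs without a MySQL server. With PyMySQL you would pass the real cursor and mysql_db objects instead.

```python
INSERT_QUERY = "INSERT INTO `pokemon` (`name`, `cp`, `hp`) VALUES (%s, %s, %s)"

def add_entries(connection, cursor, rows):
    """Insert every row with a single executemany() call, then commit."""
    cursor.executemany(INSERT_QUERY, rows)
    connection.commit()

class FakeCursor:
    """Stand-in that only records what it was asked to run."""
    def __init__(self):
        self.calls = []

    def executemany(self, query, rows):
        self.calls.append((query, list(rows)))

class FakeConnection:
    """Stand-in that only counts commits."""
    def __init__(self):
        self.commits = 0

    def commit(self):
        self.commits += 1

rows = [("Charizard", 120, 200), ("Pikachu", 60, 100), ("Squirtle", 78, 102)]
fake_cursor, fake_connection = FakeCursor(), FakeConnection()
add_entries(fake_connection, fake_cursor, rows)
print(len(fake_cursor.calls[0][1]), "rows,", fake_connection.commits, "commit")
```

Passing the connection and cursor as parameters, rather than relying on module-level variables, also makes the function easier to test in this way.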

Reading Data from the Database

...

# Reading data from the database
def read_entry():
    # SQL query
    query = '''
    SELECT `name`, `cp`, `hp` FROM `pokemon`;
    '''
    cursor.execute(query)
    # Fetching data from the database
    for data in cursor.fetchall():
        print(
            data['name'],
            data['cp'],
            data['hp']
        )

if __name__ == "__main__":
    # create_db_table()
    # add_entry()
    read_entry()

cursor.close()

The read_entry() function executes an SQL query that selects all values from the table pokemon. The data is then fetched using the cursor.fetchall() function.

You'll get all the entries inserted into the database when you run the code.

Charizard 120 200
Pikachu 60 100
Squirtle 78 102

Updating Data in the Database

...

# Function to update an entry
def update_entry():
    query = '''
            UPDATE `pokemon`
            SET `cp` = %s
            WHERE `id` = %s
            '''
    # Executing SQL query with values
    cursor.execute(query, (140, 2))
    # Committing the changes
    mysql_db.commit()

if __name__ == "__main__":
    # create_db_table()
    # add_entry()
    update_entry()
    read_entry()

cursor.close()

The update_entry() function is defined, and within it, an SQL query is written to update the pokemon table by setting the value for the cp field for the supplied id.

The cursor.execute() function executes the query that updates the cp of the Pokemon to 140 whose id is equal to 2.

The changes are then saved to the database using mysql_db.commit(). When you run the code, you'll see the change in the value.

Charizard 140 200
Pikachu 60 100
Squirtle 78 102

You can see that the Charizard's cp has been updated, and it is now 140 because it has an id of 2, which may differ in your situation.

Deleting Data from the Database

# Function to delete the entry
def delete_entry():
    query = '''
            DELETE FROM `pokemon`
            WHERE `id` = %s
            '''
    # Executing SQL query for deletion (parameters passed as a tuple)
    cursor.execute(query, (2,))
    # Committing the changes
    mysql_db.commit()

if __name__ == "__main__":
    # create_db_table()
    # add_entry()
    # update_entry()
    delete_entry()
    read_entry()

cursor.close()

The delete_entry() function executes an SQL query to remove the entire record from the pokemon table with the supplied id.

When you run the code, you'll see that the entire record of the id equal to 2 has been deleted.

Pikachu 60 100
Squirtle 78 102

Conclusion

You may work with MySQL databases in Python by using MySQL client libraries, and in this article, you've learned how to create and communicate with MySQL databases using the PyMySQL library.

First, you learned to create a MySQL database using the PyMySQL library in Python.

You interacted and performed the following operations after the database was established:

Created a MySQL database table
Inserted data into the database
Read that data from the database
Updated the data in the database
Deleted the data from the database

There are various libraries available for building a MySQL database in Python, and the process of creating and communicating with the database is nearly identical to that described in this article.

Slash(/) and Asterisk(*) in Function Definition

Sachin Pal — Tue, 26 Mar 2024 04:30:33 GMT

When you read the documentation of some functions you see a slash (/) and an asterisk (*) passed in the function definition. Why are they passed in a function?

Why Slash and Asterisk are Used?

The slash (/) and asterisk (*) are used to determine how an argument should be passed when a function is called.

Simply said, the parameters on the left side of the slash (/) must be supplied as positional-only arguments, whereas the parameters on the right side can be passed as either positional or keyword arguments.

An asterisk (*) indicates that the parameters on the right side must be supplied as keyword-only arguments, while the parameters on the left side can be passed as positional or keyword arguments.

def func(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
         -----------    ----------     ----------
           |             |                  |
           |        Positional or keyword   |
           |                                - Keyword only
            -- Positional only

Left-side Parameters	Symbol	Right-side Parameters
Positional-only arguments	`/`	Either positional or keyword arguments
Either positional or keyword arguments	`*`	Keyword-only arguments
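One way to confirm how Python classifies each parameter is to inspect the function's signature. The following is a small sketch using the standard inspect module; the function func simply mirrors the diagram above and its body is made up for illustration.

```python
import inspect

# Hypothetical function mirroring the diagram above
def func(pos1, pos2, /, pos_or_kwd, *, kwd1, kwd2):
    return pos1, pos2, pos_or_kwd, kwd1, kwd2

# Map each parameter name to the kind Python assigned to it
kinds = {
    name: param.kind.name
    for name, param in inspect.signature(func).parameters.items()
}

for name, kind in kinds.items():
    print(f"{name}: {kind}")
```

The parameters before the slash are reported as POSITIONAL_ONLY, the one between the symbols as POSITIONAL_OR_KEYWORD, and those after the asterisk as KEYWORD_ONLY.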

Slash and Asterisk in Function Parameter

Here's a simple example to show that when slashes and asterisks appear in a function definition, then parameters are bound to positional-only and keyword-only arguments, respectively.

# Normal Function
def func_params(x, y, /, *, z):
    print(x, y, z)

You can conclude that x and y are positional-only parameters, while z is keyword-only. Let's experiment to see if the parameters are bound to positional-only and keyword-only arguments.

params = func_params(2, y=3, z=4)

The parameter x is supplied as a positional argument, while y and z are passed as keywords or named arguments. When you run the code, the following output will be produced.

Traceback (most recent call last):
  ...
    params = func_params(2, y=3, z=4)
             ^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: func_params() got some positional-only arguments passed as keyword arguments: 'y'

Python raises a TypeError indicating that the positional-only argument (y) was given as a keyword argument. If you simply supply y as a positional argument, the code will not generate any errors.

# Normal Function
def func_params(x, y, /, *, z):
    print(x, y, z)

# x & y passed as positional argument and z as a keyword argument
params = func_params(2, 3, z=4)
--------------------
2 3 4

Can Asterisk Be Used Ahead of Slash

The asterisk (*) must come after the slash (/). Now, you may be wondering if this is a rule.

You may think of it as a rule: if you use both a slash and an asterisk, the slash must come before the asterisk; otherwise, Python will throw a syntax error.

# Used asterisk before slash
def func_params(x, y, *, /, z):
    print(x, y, z)

params = func_params(2, 3, z=4)

The code has been modified, and the asterisk is used before the slash. When you run the code, you'll get a syntax error.

  ...
    def func_params(x, y, *, /, z):
                             ^
SyntaxError: / must be ahead of *

Logically, inserting an asterisk before the slash doesn't make sense, because the parameters on the right side of the asterisk are keyword-only, while the parameters on the left side of the slash are positional-only.

This will not work at all. Here is an example to demonstrate this situation.

# Used asterisk before slash
def func_params(x, y, *, a, b, /, z):
    print(x, y, a, b, z)

params = func_params(2, 3, a=5, b=6, z=4)

The parameters a and b are between the asterisk and the slash, making it unclear if they are positional or keyword-only parameters. This code will still throw a syntax error even after passing a and b as keyword arguments.
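Because the error is raised while the code is being parsed, it can be checked without defining or calling anything. The sketch below uses the built-in compile() on two source strings; the helper name is_valid is made up for this illustration.

```python
# Source strings for the two orderings; compile() only parses them
slash_first = "def func_params(x, y, /, *, z): pass"
asterisk_first = "def func_params(x, y, *, /, z): pass"

def is_valid(source):
    """Return True if the source parses, False if it raises SyntaxError."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(is_valid(slash_first))
print(is_valid(asterisk_first))
```

The first definition parses and the second is rejected, regardless of how the function would later be called.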

Using Either One of Slash (/) or Asterisk (*) in the Function's Parameter?

You can utilise either one depending on the type of parameters you require, whether positional-only or keyword-only. Here's an example showing the use of only a slash (/) in a function definition.

def func_params(pos1, pos2, /, pos_or_kw):
    print(pos1, pos2, pos_or_kw)

params = func_params(2, 3, pos_or_kw=4)
--------------------
2 3 4

The parameters pos1 and pos2 are positional-only parameters and pos_or_kw is a positional-or-keyword parameter.

Similarly, you can use only a bare asterisk (*) in the function definition.

def func_params(pos_or_kw, *, kw1, kw2):
    print(pos_or_kw, kw1, kw2)

params = func_params(4, kw1=3, kw2=2)
--------------------
4 3 2

Writing a Function that Accepts Positional-only Arguments

If you want to create a function or class that only accepts positional arguments, add the slash (/) at the end of the parameter list.

# Function that takes only positional arguments
def prescription(med1, med2, med3, /):
    print("Prescribed Meds:")
    print(f"Med 1: {med1}")
    print(f"Med 2: {med2}")
    print(f"Med 3: {med3}")

prescription("Paracetamol", "Omeprazole", "Ibuprofen")
--------------------
Prescribed Meds:
Med 1: Paracetamol
Med 2: Omeprazole
Med 3: Ibuprofen

The function prescription() takes three arguments: med1, med2, and med3. You can see that a slash (/) is written at the end of the parameters which makes them positional-only.

You'll get an error if you try to pass a keyword argument to the function prescription().

prescription("Paracetamol", "Omeprazole", med3="Ibuprofen")
--------------------
Traceback (most recent call last):
  ...
    prescription("Paracetamol", "Omeprazole", med3="Ibuprofen")
TypeError: prescription() got some positional-only arguments passed as keyword arguments: 'med3'

Writing a Function that Accepts Keyword-only Arguments

You can use a bare asterisk (*) in a function definition to make the parameters on the right side keyword-only.

If you want a function that only accepts keyword parameters, simply include a bare asterisk (*) at the beginning of the function parameter list.

# Function that takes only keyword arguments
def prescription(*, med1, med2, med3):
    print("Prescribed Meds:")
    print(f"Med 1: {med1}")
    print(f"Med 2: {med2}")
    print(f"Med 3: {med3}")

prescription(med1="Paracetamol", med2="Omeprazole", med3="Ibuprofen")
-------------------
Prescribed Meds:
Med 1: Paracetamol
Med 2: Omeprazole
Med 3: Ibuprofen

The function prescription() now only accepts keyword arguments and disallows positional arguments.
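To see the rejection in action, here is a short sketch that calls a keyword-only function positionally and catches the resulting TypeError. The function body is simplified to return a list; that change is only for illustration.

```python
# Keyword-only version of prescription(), simplified to return the values
def prescription(*, med1, med2, med3):
    return [med1, med2, med3]

try:
    # Positional arguments are not accepted by this definition
    prescription("Paracetamol", "Omeprazole", "Ibuprofen")
except TypeError as exc:
    error = str(exc)
    print(f"TypeError: {error}")

# The same values passed by keyword are accepted
meds = prescription(med1="Paracetamol", med2="Omeprazole", med3="Ibuprofen")
print(meds)
```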

Valid & Invalid Function Definitions

Here are some valid function definitions that can go well with slash and asterisk. Source

def f(p1, p2, /, p_or_kw, *, kw):
def f(p1, p2=None, /, p_or_kw=None, *, kw):
def f(p1, p2=None, /, *, kw):
def f(p1, p2=None, /):
def f(p1, p2, /, p_or_kw):
def f(p1, p2, /):

You are most likely familiar with the rules for defining positional and keyword parameters, and the function definitions mentioned above are valid in accordance with those rules.

However, the following function definitions are considered invalid because they aren't defined as per the rules.

def f(p1, p2=None, /, p_or_kw, *, kw):
def f(p1=None, p2, /, p_or_kw=None, *, kw):
def f(p1=None, p2, /):

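Once a parameter with a default value has been defined, every subsequent positional-only or positional-or-keyword parameter must also have a default value to avoid ambiguity in function calls. Parameters after the asterisk are exempt from this rule: a keyword-only parameter may or may not have a default.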
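As a quick illustration of the default-value rule, the sketch below takes one of the valid definitions listed above and gives it a body (the body is an assumption added so the function can be called). Note that kw has no default even though it follows parameters that do.

```python
# p2 and p_or_kw have defaults; kw is keyword-only, so it may stay required
def f(p1, p2=None, /, p_or_kw=None, *, kw):
    return p1, p2, p_or_kw, kw

minimal = f(1, kw=4)        # defaults fill in p2 and p_or_kw
full = f(1, 2, 3, kw=4)     # every parameter supplied explicitly

print(minimal)
print(full)
```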

Conclusion

The slash (/) and asterisk (*) determine how an argument should be passed when you call the function.

A slash (/) in the function definition indicates that the parameters on the left side must be treated as positional-only.

A bare asterisk (*) in function definition indicates that the parameters on the right side must be treated as keyword-only.

You can use both of them or just one in the function definition. When you use both, the slash must come before the asterisk.

That's all for now

Keep Coding

A Comprehensive Guide to Decorators in Python

Sachin Pal — Wed, 20 Mar 2024 04:30:43 GMT

You might have encountered functions or classes decorated with functions prefixed with "@", for example, @random. These are known as decorators as they are placed above your class or function.

In this tutorial, you will learn about:

Decorators in Python
How to create a custom decorator
The working of a decorator and how it modifies the original function
How to create a custom decorator that accepts arguments and how it works
Applying multiple decorators on top of a function and how they operate on the original function

Decorator

What is a decorator in Python? A decorator is an advanced function in Python that modifies the original function without changing its source code. It offers a way to add functionality to existing functions.

If you want to create a class but don't want to write the required magic methods (such as the __init__ method) inside it, you can use the @dataclass decorator on top of the class, and it will take care of the rest.

# @dataclass decorator example
from dataclasses import dataclass

@dataclass
class Pokemon:
    name: str
    high_power: int
    combat_power: int

    def double_hp(self):
        return self.high_power * 2

monster = Pokemon("Charizard", 200, 180)
print(monster.__dict__)
charizard_hp_doubled = monster.double_hp()
print(charizard_hp_doubled)

The @dataclass decorator adds the required magic method within the Pokemon class without changing the source code of the class.

This class works just like any other normal class that contains the __init__ method in Python, and you can access its attributes, and create methods.

{'name': 'Charizard', 'high_power': 200, 'combat_power': 180}
400

Decorating the class with @dataclass eliminates the need to write the initializer method and other required methods within the class.

Creating a Custom Decorator

Although there are several pre-built decorator functions available, there is also a way to design a custom decorator function for a particular task.

Consider the following simple decorator function that logs a message on the console whenever the original function is called.

# Decorator function
def log_message(func):
    def wrapper():
        print(f"{func.__name__} function is called")
        func()
    return wrapper

The log_message() function is a decorator function that accepts another function (func) as its argument. Inside log_message(), a wrapper() function is defined which prints a message and calls func.

The log_message() returns the wrapper, effectively changing the behaviour of the original function (func).

You can now create a normal function and decorate it with the log_message() decorator function.

# Decorator function
...

@log_message
def greet():
    print("Welcome to GeekPython")

greet()

Observe how the greet() function is decorated with the log_message() (@log_message) function. When decorating a function on top of another function, you have to stick to this convention.

When you run this code, you'll get the following result.

greet function is called
Welcome to GeekPython

Notice that the greet() function prints a simple message but the @log_message decorator altered the behaviour of the greet() function by adding a message before calling the original function. This modification happens while preserving the signature of the original function (greet()).

How Decorators Work?

Look at this part of the code from the above section where you used the @log_message to decorate the greet() function.

@log_message
def greet():
    print("Welcome to GeekPython")

The above code is equivalent to the following expression.

greeting = log_message(greet)

You will obtain the same outcome as before if you execute the code after making the following modifications.

# Decorator function
def log_message(func):
    def wrapper():
        print(f"{func.__name__} function is called")
        func()
    return wrapper

def greet():
    print("Welcome to GeekPython")

greeting = log_message(greet)
greeting()
--------------------
greet function is called
Welcome to GeekPython

The greet() function is passed to the log_message() function and stored inside the greeting. In the next line, the greeting is called just like any other function. What is happening and how does it work?

After this line (greeting = log_message(greet)) is executed, the variable greeting points to the wrapper() returned by log_message(). If you print the variable greeting, you'll get the reference of the wrapper() function.

greeting = log_message(greet)
print(greeting)
--------------------
<function log_message.<locals>.wrapper at 0x0000024EF60C4C20>

This wrapper() function prints a message and has a reference to the greet() function as func and it calls this function within its own body to maintain the original functionality while adding extra behaviour.
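Since the decorated name ends up pointing at wrapper(), attributes such as __name__ and __doc__ describe wrapper() rather than the original function. A minimal sketch, assuming the standard library's functools.wraps (which this article does not otherwise use), shows one common way to copy that metadata back onto the wrapper. The docstring on greet() is added here only for illustration.

```python
import functools

def log_message(func):
    @functools.wraps(func)  # copy __name__, __doc__, etc. from func onto wrapper
    def wrapper(*args, **kwargs):
        print(f"{func.__name__} function is called")
        return func(*args, **kwargs)
    return wrapper

@log_message
def greet():
    """Print a welcome message."""
    print("Welcome to GeekPython")

# Without functools.wraps these would report 'wrapper' and None
print(greet.__name__)
print(greet.__doc__)
```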

Defining Decorator Without Inner Function

One may wonder why the code in the wrapper() function cannot be inserted inside the scope of the log_message() function like in the following code.

# Decorator function
def log_message(func):
    print(f"{func.__name__} function is called")
    func()
    return log_message

@log_message
def greet():
    print("Welcome to GeekPython")

greet()

In the above code, the code inside the wrapper() function is now placed within the log_message() function's scope. When you run the code, you'll see that the greet() function's behaviour has changed, but you also get an error.

greet function is called
Welcome to GeekPython
Traceback (most recent call last):
  ...
    greet()
TypeError: log_message() missing 1 required positional argument: 'func'

It says one argument is missing when you called the greet() function which means that the greet() function is now pointing to the log_message() function. But when you simply don't call the greet function, it won't throw any error.

...
@log_message
def greet():
    print("Welcome to GeekPython")

greet
--------------------
greet function is called
Welcome to GeekPython

There is little flexibility and very little you can do with it, yet in certain instances it will work.

Handling Function Arguments Within Decorator

What if you have a more complex function that accepts arguments and processes them? In that case, you can't approach the problem this way.

# Decorator function
def log_message(func):
    print(f"{func.__name__} function is called")
    func()
    return log_message

@log_message
def greet(user):
    print(f"Welcome to GeekPython: {user}")

greet("Sachin")

This code will result in an error as the log_message() function doesn't have a helper function to handle the argument the greet() function accepts.

greet function is called
Traceback (most recent call last):
  ...
    @log_message
     ^^^^^^^^^^^
  ...
    func()
TypeError: greet() missing 1 required positional argument: 'user'

Defining Decorator With Inner Function to Handle Function Arguments

You can manage the arguments received by the greet() function by incorporating a nested function (wrapper()) within the log_message() decorator function, using *args and **kwargs as parameters.

# Decorator function
def log_message(func):
    def wrapper(*args, **kwargs):
        print(f"{func.__name__} function is called")
        func(*args, **kwargs)
    return wrapper

@log_message
def greet(user):
    print(f"Welcome to GeekPython: {user}")

greet("Sachin")
--------------------
greet function is called
Welcome to GeekPython: Sachin

This time, the code printed the argument ("Sachin") supplied to the greet() function when it was called, so you didn't receive any errors.

The *args and **kwargs accepted by wrapper() are used to pass the arguments on to func (a reference to the original function), which enables the decorator to handle whatever arguments the original function accepts.
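To make the forwarding visible, the sketch below prints what wrapper() actually receives. The extra greeting parameter on greet() and the return statements are assumptions added for this illustration.

```python
def log_message(func):
    def wrapper(*args, **kwargs):
        # args is a tuple of positional arguments, kwargs a dict of keyword arguments
        print(f"{func.__name__} called with args={args} kwargs={kwargs}")
        return func(*args, **kwargs)
    return wrapper

@log_message
def greet(user, greeting="Welcome"):
    return f"{greeting} to GeekPython: {user}"

# One positional and one keyword argument, both forwarded unchanged
message = greet("Sachin", greeting="Hello")
print(message)
```

The positional argument arrives in args, the keyword argument in kwargs, and unpacking both in the call to func hands them over exactly as the caller supplied them.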

Returning Values from Decorator

In the example above, using greet("Sachin") resulted in the output. However, what if you wanted to return a value from the decorator?

@log_message
def greet(user):
    print(f"Welcome to GeekPython: {user}")
    return f"User: {user}"

# Trying to return a value
greeting = greet("Sachin")
print(greeting)

Since the wrapper() function inside @log_message calls func but doesn't return its result, greet("Sachin") evaluates to None, and that is what gets printed.

greet function is called
Welcome to GeekPython: Sachin
None

To handle this situation, you need to ensure that the wrapper() function returns the return value of the original function.

# Decorator function
def log_message(func):
    def wrapper(*args, **kwargs):
        print(f"{func.__name__} function is called")
        return func(*args, **kwargs)
    return wrapper

When you run the following code, you'll get the value returned by the greet() function.

# Decorator function
def log_message(func):
    def wrapper(*args, **kwargs):
        print(f"{func.__name__} function is called")
        return func(*args, **kwargs)
    return wrapper

@log_message
def greet(user):
    print(f"Welcome to GeekPython: {user}")
    return f"User: {user}"

greeting = greet("Sachin")
print(greeting)
--------------------
greet function is called
Welcome to GeekPython: Sachin
User: Sachin

Creating Decorator that Accepts Argument

So far you've created simple decorators but decorators can also accept arguments. Consider the following decorator that accepts arguments.

# Decorator function to slice a string
def slice_string(start=0, end=0, step=None):
    def slice_decorator(func):
        def slice_wrapper(*args, **kwargs):
            print(f"Sliced from char {start} to char {end}.")
            if func(*args, **kwargs) == "":
                print("Text is not long enough.")
            result = func(*args, **kwargs)
            return result[start: end: step]
        return slice_wrapper
    return slice_decorator

In the above code, a decorator function slice_string() is defined. This (slice_string()) decorator function accepts three arguments: start (defaults to 0), end (defaults to 0), and step (defaults to None).

Within this (slice_string()) function, the inner function, slice_decorator(), takes another function (func) as an argument and within the slice_decorator() function, a wrapper function (slice_wrapper()) is defined.

The slice_wrapper() function takes any positional (*args) and keyword (**kwargs) arguments required to handle arguments if any accepted by the original function.

The slice_wrapper() function prints a simple message and, in the next line, checks whether the value returned by the original function is an empty string; if it is, a message is printed. It then calls the original function again and returns the result sliced to the specified range.

This slice_wrapper() function is returned by the slice_decorator() function and eventually, the slice_decorator() function is returned by the slice_string() function.

Now you can create a function and decorate @slice_string on top of it.

# Decorator function to slice a string
...

@slice_string(2, 7)
def intro(text):
    return text

The intro() function is defined that takes text as an argument and returns it. Two arguments (2 and 7) are passed to the @slice_string decorator, meaning the text will be sliced from the character at index 2 to index 7 (excluding the character at the 7th index).

# Decorator function to slice a string
...

chars = intro("Welcome to GeekPython")
print(chars)
--------------------
Sliced from char 2 to char 7.
lcome

Overall, a decorator function that accepts arguments typically involves the interaction of three functions: the outer function (the decorator itself) that accepts arguments, an inner function (the wrapper) that receives the original function, and a nested function (the innermost wrapper) that modifies the behaviour of the original function.
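The three layers can be seen more plainly in a smaller sketch. The repeat() decorator below is hypothetical and not part of this article's code; it also shows that the @ syntax with arguments amounts to calling the outer function first and then applying the result to the function.

```python
def repeat(times):                             # outer: receives the decorator's arguments
    def repeat_decorator(func):                # middle: receives the original function
        def repeat_wrapper(*args, **kwargs):   # innermost: runs in place of the original
            return [func(*args, **kwargs) for _ in range(times)]
        return repeat_wrapper
    return repeat_decorator

@repeat(3)
def ping():
    return "pong"

def plain_ping():
    return "pong"

# Same effect without the @ syntax: call the outer function, then decorate
manual_ping = repeat(3)(plain_ping)

print(ping())
print(manual_ping())
```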

Here is another example of a decorator that accepts an argument.

import time

def sleep_code(t):
    def sleep_decorator(func):
        def sleep_wrapper(*args, **kwargs):
            # Calculate start time
            start = time.perf_counter()
            print(f"Execution Delayed: {t} Seconds")
            # Sleep for t seconds
            time.sleep(t)
            # Calculate end time
            end = time.perf_counter()
            # Evaluate execution time
            print(f"Execution Took   : {round(end - start)} Seconds")
            return func(*args, **kwargs)
        return sleep_wrapper
    return sleep_decorator

@sleep_code(5)
def slow_down(x, y):
    return x**y

obj = slow_down(2, 3)
print(obj)

The @sleep_code decorator takes an argument t representing time in seconds. It modifies the behaviour of the original function (slow_down()) by delaying its execution using time.sleep(t) within the innermost function (sleep_wrapper()). Additionally, before returning the result, it prints the execution time taken by the code, which is measured using time.perf_counter().

When you run the code, you'll get the following result.

Execution Delayed: 5 Seconds
Execution Took   : 5 Seconds
8

Stacking Multiple Decorators on Top of a Function

So far you might have a pretty good idea about decorators and in this section, you'll see that multiple decorator functions can be stacked on top of another function. Here's a simple example.

# First decorator function
def decorator_1(func):
    def wrapper_d1(*args, **kwargs):
        print("Called decorator 1")
        return func(*args, **kwargs)
    return wrapper_d1

# Second decorator function
def decorator_2(func):
    def wrapper_d2(*args, **kwargs):
        print("Called decorator 2")
        return func(*args, **kwargs)
    return wrapper_d2

# Decorated with multiple decorators
@decorator_1
@decorator_2
def log_message():
    return "Message logged"

message = log_message()
print(message)

Both decorator_1() and decorator_2() have the same boilerplate and log a simple message.

The log_message() is decorated with both (@decorator_1 and @decorator_2) decorators with the @decorator_1 being on the topmost level followed by the @decorator_2.

When you run this code, you'll get the following result.

Called decorator 1
Called decorator 2
Message logged

You can see that messages logged by the decorators are in the exact order as they are stacked on top of the log_message() function.

If you reverse the order of these decorators, the messages will be logged in the reversed order as well.

# Reversed the order of the decorators
@decorator_2
@decorator_1
def log_message():
    return "Message logged"

message = log_message()
print(message)
--------------------
Called decorator 2
Called decorator 1
Message logged

The code is equivalent to passing log_message() through decorator_1() first, and then passing the result (decorator_1(log_message)) through decorator_2().

log_message = decorator_2(decorator_1(log_message))

Note: When you are stacking multiple decorators on top of the function, their order matters.
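The ordering can be recorded rather than printed. In the sketch below, the tag() decorator and the calls list are hypothetical helpers made up for this illustration; each wrapper appends its label when it runs, so the list shows which wrapper executed first.

```python
calls = []

def tag(label):
    def decorator(func):
        def wrapper(*args, **kwargs):
            calls.append(label)  # record that this wrapper ran
            return func(*args, **kwargs)
        return wrapper
    return decorator

# The decorator closest to the function is applied first,
# but the topmost decorator's wrapper runs first at call time
@tag("outer")
@tag("inner")
def log_message():
    return "Message logged"

result = log_message()
print(calls, result)
```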

Practical Example

Here's an example that shows when decorating a function with multiple decorators, they need to be in order.

@slice_string(2, 7)
@sleep_code(2)
def intro(text):
    return text

chars = intro("Welcome to GeekPython")
print(chars)

When you run this code, the execution will be delayed for 4 seconds because slice_wrapper() calls the function it wraps twice, so sleep_wrapper() is invoked twice.

Sliced from char 2 to char 7.
Execution Delayed: 2 Seconds
Execution Took   : 2 Seconds
Execution Delayed: 2 Seconds
Execution Took   : 2 Seconds
lcome

If you just reverse the order of the decorators in the above code, that would just work fine.

@sleep_code(2)
@slice_string(2, 7)
def intro(text):
    return text

chars = intro("Welcome to GeekPython")
print(chars)

Output

Execution Delayed: 2 Seconds
Execution Took   : 2 Seconds
Sliced from char 2 to char 7.
lcome

You can observe the difference in the output in which the execution of the code took only 2 seconds. That's why you need to ensure that the decorators are in the correct order above the function.
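The doubled delay comes from slice_wrapper() calling the wrapped function twice. As an alternative sketch (a variation, not the article's original decorator), the wrapper can call the function once and reuse the result. The count_calls() decorator here is a hypothetical stand-in for sleep_code() so the example runs without waiting.

```python
def slice_string(start=0, end=0, step=None):
    def slice_decorator(func):
        def slice_wrapper(*args, **kwargs):
            result = func(*args, **kwargs)  # single call, reused below
            if result == "":
                print("Text is not long enough.")
            return result[start:end:step]
        return slice_wrapper
    return slice_decorator

call_log = []

def count_calls(func):
    # Hypothetical stand-in for sleep_code(): records each invocation
    def wrapper(*args, **kwargs):
        call_log.append(func.__name__)
        return func(*args, **kwargs)
    return wrapper

@slice_string(2, 7)
@count_calls
def intro(text):
    return text

chars = intro("Welcome to GeekPython")
print(chars, len(call_log))
```

With a single call inside the wrapper, the inner decorator runs once even when slice_string is stacked on top.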

Conclusion

Decorators modify the behaviour of the original function without changing the source code of the original function. They are advanced functions that do modification while preserving the original function's signature.

Python has several built-in decorator functions, and you can also create the custom decorator your program may need.

You saw when you define a custom decorator, you create a function returning a wrapper function. This wrapper function handles the modification and if your decorated function accepts arguments then it uses *args and **kwargs to pass on arguments. If the decorator function accepts arguments then you end up nesting the wrapper function into another function.

You also observed that in order to get the appropriate outcome, decorators must be stacked correctly on top of any function.

That's all for now

Keep Coding

Python's yield Keyword - How it Works

Sachin Pal — Fri, 15 Mar 2024 04:30:12 GMT

To return a value from a Python function, you must have used the return keyword. When the program encounters the return keyword, it exits the function and returns the value to the caller before proceeding to execute additional code.

However, the case is entirely different with the yield keyword. It does control the flow of execution and returns a value like the return keyword but how it behaves is different.

yield Keyword

When a function contains a yield statement, it is considered a generator function. The yield keyword alters the behavior of the function. How?

The yield keyword lets the function produce a series of values, handing each one over only when the caller requests it.

The yield keyword temporarily freezes the execution of the function and returns the value to the caller. As it freezes the execution, the state of the function is preserved and allows the function to resume from where it was left off.
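A small sketch can make the preserved state concrete. The countdown() generator below is a hypothetical example; its local variable current keeps its value between requests.

```python
def countdown(start):
    current = start
    while current > 0:
        yield current   # pause here; `current` survives until the next request
        current -= 1    # execution resumes on this line

counter = countdown(3)
first = next(counter)       # runs until the first yield
second = next(counter)      # resumes after the yield, loops, yields again
remaining = list(counter)   # consumes whatever is left

print(first, second, remaining)
```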

yield in a Function

You can define a normal function as usual with the def keyword, but instead of the return keyword, use the yield keyword in the function body.

# A function with yield kw - a generator function
def random_func():
    # Yield values
    yield "random1"
    yield "random2"
    yield "random3"

# Instance of the function
obj = random_func()

The above code defines a function named random_func(), which is essentially a generator function due to the yield keyword in the function body.

Now, this is a simple generator function that yields three values and returns them when requested. The function is initialized (random_func()) and saved in the obj variable.

When you print the obj variable, you will receive the generator object.

...
print(obj)
--------------------
<generator object random_func at 0x0000026E95924880>

So, how do you access the yielded values one at a time? The values can be accessed by calling the generator object's __next__() method, as shown below.

# Instance of the function
obj = random_func()

# Calling __next__() method on the instance
print(obj.__next__())   # or print(next(obj))
print(obj.__next__())   # or print(next(obj))
print(obj.__next__())   # or print(next(obj))
--------------------
random1
random2
random3

The __next__() method is called three times to retrieve all three values returned by the function. You can also iterate over the obj variable with the for loop to get the values, which is more efficient than repeatedly calling the __next__() method when you need to consume all of the yielded values.

# Instance of the function
obj = random_func()

# Iterating through the instance
for value in obj:
    print(value)
--------------------
random1
random2
random3

How yield Works in a Function?

How does the execution flow in the function containing the yield statement? Let's keep it very simple and understand with an example.

import logging

# Configuring logging
logging.basicConfig(level=logging.INFO)

# Simple generator function
def generate_val():
    logging.info("Start")
    value = 0
    logging.info("Before entering the loop")
    while True:
        logging.info("Before yield statement")
        yield value
        logging.info("After yield statement")
        value += 1

gen = generate_val()
print(next(gen))
print(next(gen))
print(next(gen))

The above code defines a basic generator function generate_val() that generates an infinite sequence of values starting from 0. It uses the yield statement to yield each value and increments it by 1 in each iteration of the loop.

The logging module is used to log the information. It logs various messages at different points in the generator function's execution to provide information about its flow.

When you run the code, the following output will be generated.

0
1
2
INFO:root:Start
INFO:root:Before entering the loop
INFO:root:Before yield statement
INFO:root:After yield statement
INFO:root:Before yield statement
INFO:root:After yield statement
INFO:root:Before yield statement

Notice that when the program enters the loop and encounters the yield statement, it returns the value for the first iteration (next(gen)), suspends the execution flow immediately, and then resumes from where it left off for the next iteration.

This will happen with each iteration, and the program will never exit the function until stopped manually.
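An infinite generator like this is safe to use as long as the caller decides when to stop asking. One way to take a fixed number of values, sketched here with the standard library's itertools.islice (not used elsewhere in this article) and with the logging calls left out for brevity:

```python
from itertools import islice

# Same generator as above, without the logging calls
def generate_val():
    value = 0
    while True:
        yield value
        value += 1

# islice requests only the first five values and then stops asking
first_five = list(islice(generate_val(), 5))
print(first_five)
```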

StopIteration Exception

In the above section, the program runs infinitely and never exhausts. What happens when the generator function has nothing else to evaluate?

Let's understand with a basic example.

# Function to yield only even number
def only_even(n):
    try:
        num = 0
        while num <= n:
            if num % 2 == 0:
                yield num
            num += 1
    except Exception as e:
        print(f"Error - {str(e)}.")

The code above defines the only_even() function, which accepts an argument n and returns even numbers up to n, including n, if it is an even number.

# Function to yield only even number
...
even = only_even(3)

The function (only_even()) is instantiated with the argument 3 and stored in the even variable.

The function can now generate even numbers up to 3 (0 and 2), which means you can call next() on even twice. What happens if you call next() more than twice?

# Function to yield only even number
...
even = only_even(3)
print(next(even))
print(next(even))
print(next(even))    # Raises an error

For the first two calls, the generator yields an even number. On the third call, the loop runs until num <= n becomes false and the function body ends. When a generator has no more items to produce, a StopIteration exception is raised.

0
2
Traceback (most recent call last):
  ...
    print(next(even))
          ^^^^^^^^^^
StopIteration

To avoid the StopIteration error, you can use a for loop. A for loop automatically catches StopIteration exceptions and terminates gracefully when the iterator is exhausted.

# Function to yield only even number
...
even = only_even(3)
for num in even:
    print(num)
--------------------
0
2
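What the for loop does behind the scenes can be written out by hand. The sketch below is an approximation of that behaviour, not the interpreter's actual implementation: it calls next() until StopIteration is raised. It also shows the built-in next() with a default value, which avoids the exception on an exhausted generator. The try/except inside only_even() is omitted for brevity.

```python
def only_even(n):
    num = 0
    while num <= n:
        if num % 2 == 0:
            yield num
        num += 1

collected = []
even = only_even(3)
while True:
    try:
        collected.append(next(even))
    except StopIteration:
        break  # the generator is exhausted, so stop asking

# With a default, next() returns it instead of raising StopIteration
exhausted_value = next(even, "no more values")

print(collected, exhausted_value)
```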

When to Use yield?

You can use a yield statement in a function when you want to generate a sequence of values lazily (generate values when requested) rather than computing and storing them in memory until you print them.

Let's say you want to create a function that returns all the even numbers between 0 and 100000000.

With the return keyword, the function computes all of the values and stores them in memory; when you print them, the program dumps them all at once.

def only_even(n):
    number = []
    for num in range(n):
        if num % 2 == 0:
            number.append(num)
    return number

even = only_even(100000000)
for number in even:
    print(number)

If you include a yield statement in the function, you can generate values as they are needed.

def only_even(n):
    num = 0
    while num <= n:
        if num % 2 == 0:
            yield num
        num += 1

even = only_even(100000000)
for val in even:
    print(val)

So, what is the main difference between the two programs? If you run the first program (function with return keyword), the function will compute the values and store them in memory until the program reaches the print statement, at which point it will print them, whereas the second program (function with yield keyword) generates values on-the-fly as they are requested, rather than computing and storing all of the values in memory at once.
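The difference can be made visible with sys.getsizeof(). The following is a rough sketch, not a benchmark: the function names evens_list and evens_gen are made up for this illustration, n is deliberately much smaller than in the example above, and sys.getsizeof() reports only the size of the container itself, not of the integers it refers to.

```python
import sys


def evens_list(n):
    """Build and return the full list of even numbers below n."""
    return [num for num in range(n) if num % 2 == 0]


def evens_gen(n):
    """Yield even numbers below n one at a time."""
    for num in range(n):
        if num % 2 == 0:
            yield num


n = 100_000  # kept small so the example runs quickly

as_list = evens_list(n)
as_gen = evens_gen(n)

# The list holds a reference to every value at once, so its size grows with n.
# The generator object stays small regardless of n.
list_size = sys.getsizeof(as_list)
gen_size = sys.getsizeof(as_gen)
print(f"list:      {list_size} bytes")
print(f"generator: {gen_size} bytes")

# Both produce the same values when consumed.
same_values = list(as_gen) == as_list
print(same_values)
```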

Conclusion

A yield keyword in a function body makes that function a generator function. Calling it returns a generator that produces a sequence of values, handing over one value each time the caller requests it.

The generator remembers its state between requests, so after each yield it resumes from where it left off rather than starting over.

Let's recall what you've learned:

What is yield in Python?
How the execution flows in a function containing the yield
What happens when the generator has no more items to evaluate
When to use generator functions (function with yield keyword)

Reference: https://peps.python.org/pep-0255/#specification-yield

That's all for now

Keep Coding

Parse TOML files Using tomllib Module in Python

Sachin Pal — Mon, 11 Mar 2024 14:13:11 GMT

You might have seen files with the .toml extension in programming projects. What is TOML?

TOML (Tom's Obvious Minimal Language) files can be used to store application metadata in an easily readable format. Its format is extremely simple due to its minimal syntax.

A TOML file, like a JSON file in web development, is used to store application-specific configuration data. The syntax differs, however, and TOML is often preferred in projects where human readability and simplicity are prioritized.

In this article, you will learn how to parse TOML files with the Python standard library tomllib.

TOML File

A TOML file contains data in key-value pairs and has native types such as strings, integers, floats, booleans, tables, arrays, and more.

Here's an example of a TOML file.

# This is a TOML configuration file
title = "TOML Config File"

[author]
name = "John Doe"
dob = 1979-05-27T07:32:00-08:00

[app]
app_name = "Tomparse"
version = 0.10
site."google.com" = true

[app.dependency]
libs = ["tomllib", "tomli"]

[database]
enabled = true
ports = [ 8000, 8001, 8002 ]
data = [ ["delta", "phi"], [3.14] ]
temp_targets = { cpu = 79.5, case = 72.0 }

In the above file, title is a top-level key, while author, app, app.dependency, and database are tables whose keys include name, dob, app_name, version, and so on.

The value ["tomllib", "tomli"] is an array, [ ["delta", "phi"], [3.14] ] is a nested array (an array of arrays), and { cpu = 79.5, case = 72.0 } is an inline table.

If the above file is put into JSON land, it would give the following structure.

{
  "title": "TOML Config File",
  "author": {
    "name": "John Doe",
    "dob": "1979-05-27T15:32:00.000Z"
  },
  "app": {
    "app_name": "Tomparse",
    "version": 0.1,
    "site": {
      "google.com": true
    },
    "dependency": {
      "libs": [
        "tomllib",
        "tomli"
      ]
    }
  },
  "database": {
    "enabled": true,
    "ports": [
      8000,
      8001,
      8002
    ],
    "data": [
      [
        "delta",
        "phi"
      ],
      [
        3.14
      ]
    ],
    "temp_targets": {
      "cpu": 79.5,
      "case": 72
    }
  }
}

If you haven't worked with TOML files before, this comparison should give you a sense of how the TOML syntax works.

You can use TOML for storing Python project metadata. See PEP 621 for more details.

tomllib - Parsing TOML Files

The tomllib module was added to the Python standard library in Python 3.11 for parsing TOML files. It is a read-only library: it parses TOML but does not write it.

Having said that, the tomllib has only two functions: reading TOML from a file and loading TOML from a string.

Functions

tomllib.load(fp, /, *, parse_float=float)

The load() function reads a TOML file.

Parameters:

fp - A readable and binary file object. You can pass it positionally, but not as a keyword argument.
parse_float - You can specify a custom function for parsing the float in the TOML file. It defaults to using Python's float() function. This is a keyword-only argument.

Return value:

The load() function returns the TOML file content in dictionary format.

tomllib.loads(s, /, *, parse_float=float)

The loads() function loads the TOML from a string object.

Parameters:

s or string - A string object that contains the TOML document.
parse_float - It is the same as the load() function's parse_float.

Return value:

It also returns the results in dictionary format.

Parsing a TOML File

Here's a config.toml file that contains some configuration data of an application written in TOML. You'll use the tomllib's load() function to parse the file.

import tomllib

# Opening a TOML file and reading in binary mode
with open("config.toml", "rb") as tfile:
    # Parsing TOML file content
    result = tomllib.load(tfile)
    print(result)

When you run the code, you'll get the file's content in a dictionary format.

{'title': 'TOML Config File', 'author': {'name': 'John Doe', 'dob': datetime.datetime(1979, 5, 27, 7, 32, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=57600)))}, 'app': {'app_name': 'Tomparse', 'version': 0.1, 'site': {'google.com': True}, 'dependency': {'libs': ['tomllib', 'tomli']}}, 'database': {'enabled': True, 'ports': [8000, 8001, 8002], 'data': [['delta', 'phi'], [3.14]], 'temp_targets': {'cpu': 79.5, 'case': 72.0}}}

Since the data is in dictionary format you can separate the keys and values from the data.

import tomllib

# Opening a TOML file and reading in binary mode
with open("config.toml", "rb") as tfile:
    # Parsing TOML file content
    result = tomllib.load(tfile)
    for key, value in result.items():
        print(f"Key: {key}")
        print(f"Value: {value}")

Now when you run this code, you'll get the following result.

Key: title
Value: TOML Config File
Key: author
Value: {'name': 'John Doe', 'dob': datetime.datetime(1979, 5, 27, 7, 32, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=57600)))}
Key: app
Value: {'app_name': 'Tomparse', 'version': 0.1, 'site': {'google.com': True}, 'dependency': {'libs': ['tomllib', 'tomli']}}
Key: database
Value: {'enabled': True, 'ports': [8000, 8001, 8002], 'data': [['delta', 'phi'], [3.14]], 'temp_targets': {'cpu': 79.5, 'case': 72.0}}

Exception Handling

The tomllib module includes a TOMLDecodeError exception, which is raised when the TOML document is invalid. You can catch it to handle decoding errors.

import tomllib

# Opening a TOML file and reading in binary mode
try:
    with open("config.toml", "rb") as tfile:
        # Parsing TOML file content
        result = tomllib.load(tfile)
        for key, value in result.items():
            print(f"Key: {key}")
            print(f"Value: {value}")
except tomllib.TOMLDecodeError as e:
    print(e)

As you can see, the code is wrapped in a try-except block, and any tomllib.TOMLDecodeError raised while parsing is caught in the except block, which prints the error message.

Expected '=' after a key in a key/value pair (at line 12, column 18)

Loading TOML from String

Assume you have a string with a TOML document. How do you parse that TOML? To achieve your desired result, use the tomllib.loads() function.

import tomllib

my_toml = """
[app]
app_name = "Tomparse"
version = 0.1
site."google.com" = true

[app.dependency]
libs = ["tomllib", "tomli"]
"""

load_toml = tomllib.loads(my_toml)
print(load_toml)

The variable my_toml contains a string of TOML, which is passed to the tomllib.loads() function.

As usual, this code will return a dictionary and you'll get the following result.

{'app': {'app_name': 'Tomparse', 'version': 0.1, 'site': {'google.com': True}, 'dependency': {'libs': ['tomllib', 'tomli']}}}

Conclusion

So, if you write a configuration file or document in TOML and then want to parse it, you'll know what to do.

You can use the tomllib library, which was included in the Python standard library with the release of Python 3.11. This library includes functions that can be used to read TOML files.

That's all for now

Keep Coding

How to Use split() Method in Python

Sachin Pal — Fri, 01 Mar 2024 15:21:31 GMT

The split() method is a built-in string method in Python that splits a string into substrings (words) based on a separator, or delimiter, passed as an argument. It returns a list containing the substrings.

In simple terms, if you have a sentence and want to break it into individual words, you can call split() on it and get back a list of those words.

sentence = "You are good"

# Splitting sentence into array of words
result = sentence.split()
print(result)

The above code will split the string stored in the sentence variable into three words within a list.

['You', 'are', 'good']

You may be wondering how the string got separated when there is no delimiter specified as an argument.

Syntax

.split(sep=None, maxsplit=-1)

sep - If sep is not specified (or is None), runs of consecutive whitespace are treated as a single delimiter; otherwise, the string is split at each occurrence of the specified sep.

maxsplit - If maxsplit is not specified (or is -1), there is no limit on the number of splits. Otherwise, at most maxsplit splits are made, so the resulting list has at most maxsplit + 1 elements. For example, with maxsplit=1 the string is split into at most 2 elements.
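The two parameters can be compared on a single string. The sketch below uses a made-up sentence with uneven spacing to show how the default separator differs from an explicit single space, and what maxsplit leaves behind.

```python
text = "You   are  good"  # three spaces, then two spaces

# No sep: runs of whitespace count as one delimiter, so no empty strings appear.
default_split = text.split()

# Explicit single space: every space is a delimiter, so empty strings appear.
space_split = text.split(" ")

# maxsplit caps the number of splits; the remainder stays in the last element.
limited = text.split(maxsplit=1)

print(default_split)  # ['You', 'are', 'good']
print(space_split)    # ['You', '', '', 'are', '', 'good']
print(limited)        # ['You', 'are  good']
```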

Splitting based on delimiter

Suppose you have a comma-separated sentence and you want to split that sentence into a list of substrings based on the comma, you can simply pass the separator as a comma (",") and be done.

sentence = "Sachin, Yashwant, Rishu, Abhishek, are good"

# Splitting sentence based on comma
result = sentence.split(sep=",")
print(result)

If you run this code, you'll get the following output.

['Sachin', ' Yashwant', ' Rishu', ' Abhishek', ' are good']

You can see that the split(sep=",") split the sentence where the comma is placed and you get five substrings within a list.

You can split the string based on any delimiter such as commas, spaces, tabs, semicolons, etc., depending on the specific requirements of the data being processed.

Suppose you have the data containing names separated by a weird expression and you want to take out the names from the data.

sentence = "Sachin/<>/Yashwant/<>/Rishu/<>/Abhishek/<>/Yogesh"

# Splitting
result = sentence.split(sep="/<>/")
print(result)

This works just like the previous example: you get a list of the names from the data, on which you can perform whatever operation you want.

['Sachin', 'Yashwant', 'Rishu', 'Abhishek', 'Yogesh']

Using maxsplit

If you have a specific number in mind for how your string should be split, you can use the maxsplit argument to specify that number.

sentence = "Sachin/<>/Yashwant/<>/Rishu/<>/Abhishek/<>/Yogesh"

# Using maxsplit
result = sentence.split(sep="/<>/", maxsplit=3)
print(result)

In the above code, the maxsplit is set to 3 which means the string will be split into four substrings.

['Sachin', 'Yashwant', 'Rishu', 'Abhishek/<>/Yogesh']

The string was split into four substrings, not three, because maxsplit sets the maximum number of splits rather than the number of resulting elements. Three splits cut the string at the first three delimiters, leaving four parts, and the rest of the string stays intact in the last element.

Example

Assume you have a file containing a person's information (name, age, and email address), and you want to access the data from the file and perform some operations on it.

def get_details(filename):
    with open(filename, "r") as file:
        lines = file.readlines()
        result = [line.split() for line in lines]
        return result

details = get_details("names")

The function, get_details, reads a file given by the filename parameter. It then reads all the lines from the file using file.readlines(). Each line is split into a list of strings using the split() method, which splits the line based on whitespace by default. This creates a list of lists, where each inner list contains the individual words from each line.

Finally, the function returns this list of lists, where each inner list represents the words from each line in the file.

You can use the object (details) to access information about a specific person.

print(details[3])
print(details[1])
print(details[9])

This will print the information for the person on the third, first, and ninth indexes respectively.

['Isabella,', 'isabellanguyen@gmail.com,', '30']
['Sophia,', 'sophiamartinez@yahoo.com,', '28']
['Charlotte,', 'charlotte.gonzales@hotmail.com,', '34']
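The trailing commas in this output suggest that each line in the file is comma-separated, so splitting on whitespace leaves the commas attached. One possible alternative, assuming each line has the form name, email, age (inferred from the output, not confirmed by the file itself), is to split on the comma and strip the surrounding whitespace. The parse_line helper and the sample lines below are hypothetical.

```python
# Sample lines in the format the output above suggests (an assumption).
lines = [
    "Isabella, isabellanguyen@gmail.com, 30\n",
    "Sophia, sophiamartinez@yahoo.com, 28\n",
]


def parse_line(line):
    """Split one record on commas and strip whitespace around each field."""
    name, email, age = (field.strip() for field in line.split(","))
    return name, email, int(age)


records = [parse_line(line) for line in lines]
print(records)
```

Converting the age to int at this point means later code can compare or sort by it without further parsing.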

Conclusion

The split() method is used to split a string into substrings within a list. You can also split your string based on a delimiter which you can pass as an argument to the split() method and you can control the splits by setting a value to the maxsplit parameter.

That's all for now

Keep Coding

Python's __getitem__ Method

Sachin Pal — Thu, 15 Feb 2024 15:43:01 GMT

You have probably used square bracket notation ([]) to access items in a collection such as a list, tuple, or dictionary.

my_lst = ["Sachin", "Rishu", "Yashwant"]
item = my_lst[0]
print(item)

In the above code, the first element of the list (my_lst) is accessed using square bracket notation (my_lst[0]) and printed.

But do you know how this happened? When my_lst[0] is evaluated, Python calls the list's __getitem__ method.

my_lst = ["Sachin", "Rishu", "Yashwant"]
# item = my_lst[0]
item = my_lst.__getitem__(0)
print(item)

This is the same as the above code, but Python handles it behind the scenes, and you will get the same result, which is the first element of my_lst.

You may be wondering what the __getitem__ method is and where it should be used.

The __getitem__ Method

The __getitem__ method is usually implemented within Python classes to make the object of that class work with the square bracket notation.

To put it simply, you can use square bracket notation on the class's objects in the same way that you would with Python's built-in types.

my_lst = ["Sachin", "Rishu", "Yashwant"]
item = my_lst[0]
print(item)

# Using [] operator on class instance w/o __getitem__ method
class Names:
    def __init__(self):
        pass

friends = Names()
result = friends["Sachin", "Rishu", "Yashwant"]

In the first block of code, the [] operator is used to access a list item.

In the second block of code, a class (Names) is defined without implementing the __getitem__ method. The [] operator is applied to the class's object (friends).

What happens when you run the above code? The first block of code will work because the [] operator is used on the built-in datatype, but the second block of code will fail because the class does not have the __getitem__ method to perform this functionality on the object.

Sachin
Traceback (most recent call last):
  ....
    result = friends["Sachin", "Rishu", "Yashwant"]
TypeError: 'Names' object is not subscriptable

If you implement the __getitem__ method within the class, the above code will work just fine.

...
# Using [] operator on class instance with __getitem__ method
class Names:
    def __init__(self):
        pass

    # Implemented the __getitem__ method
    def __getitem__(self, name):
        print(f'Name: {name[0]}')

friends = Names()
result = friends["Sachin", "Rishu", "Yashwant"]

When you run this code, there will be no errors, and the first element will be printed as specified by the __getitem__ method.

Sachin
Name: Sachin

You can see the difference after implementing the __getitem__ method: the object (friends) now supports accessing values using the [] operator.

Now how does __getitem__ work? First, let's understand the syntax of this magic method.

Syntax

__getitem__(self, key)

The __getitem__ method typically accepts a single argument besides self, which is commonly referred to as key when dealing with mappings like dictionaries. This argument represents the index or key used to access the value in the object.
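What arrives in key depends on what is written between the brackets. The sketch below uses a hypothetical Names container (not the class from the example above) to show an integer index, a slice, and a tuple key reaching __getitem__; rejecting other key types with TypeError is a design choice made for this illustration.

```python
class Names:
    """Hypothetical container that wraps a list of names."""

    def __init__(self, *names):
        self._names = list(names)

    def __getitem__(self, key):
        # obj[1] passes an int, obj[0:2] passes a slice object,
        # and obj["a", "b"] passes a tuple.
        if isinstance(key, slice):
            return Names(*self._names[key])
        if isinstance(key, int):
            return self._names[key]
        raise TypeError(f"unsupported key type: {type(key).__name__}")


friends = Names("Sachin", "Rishu", "Yashwant")

first = friends[0]            # int key
last = friends[-1]            # negative int key, delegated to the list
pair = friends[0:2]._names    # slice key returns a new Names object

print(first, last, pair)
```

Delegating to the underlying list keeps the behaviour of negative indexes and out-of-range errors consistent with built-in sequences.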

Example

Assume you have a large number of entries with employee names and you want to retrieve the first name of each employee from them.

class EmployeeFirstName:
    def __init__(self, file):
        self.file = file

    def __getitem__(self, name):
        with open(self.file, "r") as f:
            result = f.readlines()
            first_name = [fname.split()[name] for fname in result]
            return first_name

employee = EmployeeFirstName("employee_data")
print("List of Employees First Names:")
print(employee[0])

The class accepts a file parameter, and the __getitem__ method opens and reads the file line by line, iterating through each line and splitting it into words with the split() method before selecting the word at the index name.

When the class (EmployeeFirstName) is instantiated, it receives a file (employee_data), and the square bracket notation ([]) is used on the instance (employee[0]) to access the values at the 0th index from the file.

When you run the code, you will get a list of the employees' first names from the file.

List of Employees First Names:
['John', 'Cindy', 'Jacob', 'Priya', 'Ivan', 'Ji-min', 'Maria', 'Alexander', 'Li', 'Emily']

You were able to access the values because the __getitem__ method was implemented in the class and you can see that the name argument is used as a key to access values based on the index.

If you change the value of the name to 1, you will get the employees' last names.

employee = EmployeeFirstName("employee_data")
print("List of Employees Last Names:")
print(employee[1])

Output

List of Employees Last Names:
['Doe', 'Carl', 'Dawson', 'Patel', 'Petrov', 'Kim', 'Garcia', 'Sokolov', 'Wei', 'Johnson']

Conclusion

The __getitem__ method is implemented within a Python class, allowing the class's objects to use square bracket notation ([]) in the same way that built-in types do.

You can't use the [] operator on the instance without implementing the __getitem__ method.

That's all for now

Keep Coding

Python's map() Function With Examples

Sachin Pal — Tue, 23 Jan 2024 17:24:40 GMT

What would you do if you wanted to apply a function to each item in an iterable? Your first step would be to use that function by iterating over each item with the for loop.

Python has a built-in function called map() that handles the iteration for you, so you can avoid writing the loop yourself.

Python's map() Function

The map() function in Python is a built-in function that allows you to apply a specific function to each item in an iterable without using a for loop.

Syntax

map(function, iterable)

function: map() applies this function to each item in the iterable.
iterable: list, tuple, string, or iterator object.

The map() function returns an iterator object containing the result, which can be iterated over to generate values on the fly.
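The lazy, single-use nature of the returned iterator can be seen in a short sketch. It uses the built-in abs() as the function and a made-up list of numbers.

```python
mapping = map(abs, [-3, 1, -2])

# Values are computed only when requested.
first = next(mapping)   # 3

# Consuming the rest exhausts the iterator.
rest = list(mapping)    # [1, 2]

# A second pass yields nothing because a map object can be consumed only once.
again = list(mapping)   # []

print(first, rest, again)
```

If the results are needed more than once, convert the map object to a list (or another collection) and reuse that instead.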

Let's take a closer look at the map() function with an example.

Example

Consider the following example: a function called add_two() adds 2 to the argument a. The map() function applies the function (add_two()) to the iterable stored in the variable my_iterable.

# Function to be applied
def add_two(a):
    return a + 2

# List of integers
my_iterable = [34, 23, 45, 10]

# Using map() function
mapping = map(add_two, my_iterable)
result = list(mapping)
print(f"Final result: {result}.")

When you run the above code, you'll get a list in which each element from the original list my_iterable has been incremented by 2, as specified by the add_two function.

Final result: [36, 25, 47, 12].

You may be wondering how the function is applied to each item in the iterable without the use of an explicit for loop. How does the map() function do it?

How map() Function Works?

You now know that the map() function applies a function to each item in the iterable, but how exactly does it do so?

Let me explain with an example. How would you apply a function to the items in a list without using the map() function? You'll use the for loop to iterate through each item in the list and then apply the function.

# Function to be applied
def add_two(a):
    return a + 2

# Empty list to store the final result
final_list = []

# List of integers
my_iterable = [34, 23, 45, 10]

# Using for loop
for item in my_iterable:
    result = add_two(item)
    final_list.append(result)

print(f"Final result: {final_list}.")

As you can see in the above code, a for loop is used to iterate through each item in the my_iterable variable, and within the for loop, the add_two function is applied to each item, with the result appended to a list final_list and printed.

When you run the above code, you will get the same result as in the first example.

Final result: [36, 25, 47, 12].

That is exactly how the map() function would have worked in the background for simple operations like the one above.

In fact, you can imitate the map() function in just one line of code for the above example.

# Function to be applied
def add_two(a):
    return a + 2

# List of integers
my_iterable = [34, 23, 45, 10]

# Using list comprehension
mapping = [add_two(item) for item in my_iterable]
print(f"Final result: {mapping}.")

Using the list comprehension in the above code eliminated the need to write the additional lines of code.

Processing Multiple Inputs Using map()

Consider the following scenario: you need to take multiple user inputs and process them using a function. How would you efficiently process the inputs? You can use the map() function to accomplish your goal.

# Function to convert Fahrenheit to Celsius
def to_celcius(a):
    val = (a - 32) * 5/9
    return round(val)

# Converting each input item into integer
f_values = list(map(int, input("Enter Temperature(in F): ").split()))

# Applying to_celcius function to each item in the iterable
result = list(map(to_celcius, f_values))

# Displaying the result
print("Result:", result)

The code above accepts multiple user inputs, separated by a space (input().split()), and the int function will be applied to each input using map() before being converted into a list using the list function.

The next block applies the to_celcius function to the iterable f_values and prints the results.

Enter Temperature(in F): 1 100 32 273
Result: [-17, 38, 0, 134]

Using map() with Multiple Iterables

So far, you've only used the map() function with a single iterable, but the map() function can accept multiple iterables.

Consider the following example, in which multiple iterables are processed using the map() function.

def add(x, y):
    return x+y

# Iterables
first = [2, 4, 6, 8]
second = [1, 3, 5, 7]

# Both iterables are passed to map()
mapping = list(map(add, first, second))
print(f"Result: {mapping}.")

In the above code, both iterables (first and second) and the add function are passed to the map() function.

The map() function takes one element from each iterable at a time and passes them to the add function. The add() function takes two arguments, x and y, and returns their sum. In the first iteration, x is 2 and y is 1, so the output is 3; this continues until the shortest iterable is exhausted.

Result: [3, 7, 11, 15].
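The "shortest iterable" rule matters when the inputs differ in length. In the sketch below, the second list is deliberately shortened for illustration, and map() stops without raising an error.

```python
def add(x, y):
    return x + y


first = [2, 4, 6, 8]
second = [1, 3]  # shorter than first

# map() stops after two pairs; 6 and 8 are never passed to add().
result = list(map(add, first, second))
print(result)  # [3, 7]
```

Because the extra items are dropped silently, it can be worth checking the lengths yourself when a mismatch would indicate a bug.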

Using map() with lambda Function

The lambda expression creates an anonymous function in a single line of code. Using the map() function and a lambda expression, you can achieve the desired result in a few lines of code.

first = [2, 4, 6, 8]
second = [1, 3, 5, 7]

# Using lambda expression with map()
mapping = list(map(lambda x, y: x+y, first, second))
print(f"Result: {mapping}.")

The lambda expression in the code above is the same as the previous section's add function. Using the lambda expression eliminates the need to explicitly declare a named function, saving a few lines of code.

The above code will produce the following output.

Result: [3, 7, 11, 15].

Manipulating Iterables of String

Not just integers or floats, the map() function can be used to transform iterables containing various types of elements, including strings.

words = ["welcome", "to", "geekpython"]

# Converting string to uppercase
uppercase = list(map(str.upper, words))
print(f"Result: {uppercase}.")

# Capitalizing string
capitalize = list(map(str.capitalize, words))
print(f"Result: {capitalize}.")

The map() function is used in the above code to apply the str.upper and str.capitalize functions to the words variable (a list of strings). When you run it, you will notice that the strings are transformed.

Result: ['WELCOME', 'TO', 'GEEKPYTHON'].
Result: ['Welcome', 'To', 'Geekpython'].

You can also perform various other operations by declaring a function. For instance, you can declare a function to reverse the strings or slice out some parts from the string.

words = ["welcome", "to", "geekpython"]

# Reverses a string
def reverse(string):
    return string[: : -1]

rev = list(map(reverse, words))
print(f"Reversed: {rev}.")

# Slices a string
def custom_slice(string):
    return string[1: 3: 1]

slice_string = list(map(custom_slice, words))
print(f"Sliced:   {slice_string}.")

Output

Reversed: ['emoclew', 'ot', 'nohtypkeeg'].
Sliced:   ['el', 'o', 'ee'].

Conclusion

In simple terms, the map() function maps a function to each item in an iterable and returns an iterator object. This iterator can be looped over or converted into a data type, such as a list.

In this article, you've learned:

What is the map() function and how to use it?
How map() works and imitating the map() function using for loop and list comprehension.
Processing multiple user inputs using map().
Processing multiple iterable using map().
Using lambda expression with map() to eliminate creating a named function.
Manipulating iterable containing string items.

That's all for now

Keep Coding

pickle Module: Serializing and Deserializing Python Object

Sachin Pal — Wed, 10 Jan 2024 14:51:56 GMT

Sometimes you need to send complex data over the network, save the state of the data to a file on the local disk or in a database, or cache the result of an expensive operation. In those cases, you need to serialize the data.

Python's standard library includes a module called pickle that helps you serialize and deserialize Python objects.

In this article, you'll learn about data serialization and deserialization, the pickle module's key features and how to serialize and deserialize objects, the types of objects that can and cannot be pickled, and how to modify the pickling behavior in a class.

Object Serialization

Well, serialization refers to the process of converting the data into a format that can be easily stored, transmitted, or reconstructed for later use.

Pickling is the name given to the serialization process in Python, where Python objects are converted into a byte stream. Unpickling, also known as deserializing, is the inverse operation in which byte data is converted back to its original state, reconstructing the Python object hierarchy.

The pickle Module

Pickling and unpickling are Python-specific operations that require the use of the pickle module.

The pickle module includes four functions for performing the pickling and unpickling processes on objects:

`pickle.dump(obj, file)`	`pickle.load(file)`
`pickle.dumps(obj)`	`pickle.loads(data)`

The pickle.dump() function writes the serialized byte representation of the object into a specified file or file-like object.

obj: The object to be serialized.
file: The file or file-like object, opened for binary writing, in which the serialized byte representation of the object will be written.

The pickle.dumps() function returns the serialized byte representation of the object.

obj: The object to be serialized.

The pickle.load() function reads the serialized object from the specified file or file-like object and returns the reconstructed object.

file: The file or file-like object from which the serialized data is read.

The pickle.loads() function returns the reconstructed object from the serialized bytes object.

data: The serialized bytes object to reconstruct.
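The two pairs of functions can be tried side by side without touching the disk. The following is a minimal sketch that uses io.BytesIO as the file-like object; the sample dictionary is made up for the illustration.

```python
import io
import pickle

data = {"lib": "pickle", "version": 2.1, "tags": ["a", "b"]}

# dumps()/loads(): work with a bytes object in memory.
blob = pickle.dumps(data)
copy_from_bytes = pickle.loads(blob)

# dump()/load(): work with any binary file-like object,
# here an in-memory buffer instead of a real file.
buffer = io.BytesIO()
pickle.dump(data, buffer)
buffer.seek(0)  # rewind before reading back
copy_from_file = pickle.load(buffer)

print(type(blob).__name__)
# The reconstructed object is equal to the original but is a separate object.
print(copy_from_bytes == data, copy_from_bytes is data)
print(copy_from_file == data)
```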

How to Pickle and Unpickle Data

Consider the following scenario: pickling the data and saving it to a file, then unpickling the serialized object from that file to reassemble it in its original form.

import pickle

# Sample data
my_data = {
    "lib": "pickle",
    "build": 4.33,
    "version": 2.1,
    "status": "Active"
}

# Serializing
with open("lib_info.pickle", "wb") as file:
    pickle.dump(my_data, file)

# De-serializing
with open("lib_info.pickle", "rb") as file:
    unpickled_data = pickle.load(file)

print(f"Unpickled Data: {unpickled_data}")

The above code serializes the my_data dictionary and the serialized data is written to a file called lib_info.pickle in binary mode (wb).

The serialized data is then deserialized from the lib_info.pickle using the pickle.load() function.

Unpickled Data: {'lib': 'pickle', 'build': 4.33, 'version': 2.1, 'status': 'Active'}

Take a look at another example in which you have a class that contains multiple operations.

import pickle

class SampleOperation:
    square = 5 ** 2
    addition = 5 + 7
    subtraction = 5 - 7
    division = 14 / 2

# Object created
my_obj = SampleOperation()

# Serializing
pickled_data = pickle.dumps(my_obj)
print(f"Pickled Data: {pickled_data}")

# De-serializing
unpickled_data = pickle.loads(pickled_data)
print(f"Unpickled Data (Division): {unpickled_data.division}")
print(f"Unpickled Data (Square): {unpickled_data.square}")
print(f"Unpickled Data (Addition): {unpickled_data.addition}")
print(f"Unpickled Data (Subtraction): {unpickled_data.subtraction}")

In the above code, an object of the SampleOperation class is created and stored in the my_obj variable.

The object my_obj is serialized using the pickle.dumps() function and the serialized data is stored in the pickled_data variable.

Then, the serialized data (pickled_data) is deserialized using the pickle.loads() function and the attributes of the unpickled object are printed.

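Pickled Data: b'\x80\x04\x95#\x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x0fSampleOperation\x94\x93\x94)\x81\x94.'
Unpickled Data (Division): 7.0
Unpickled Data (Square): 25
Unpickled Data (Addition): 12
Unpickled Data (Subtraction): -2

This demonstrates that the deserialization process successfully reconstructed the object. Note that the pickled bytes only reference the class by name (__main__.SampleOperation); because square, addition, subtraction, and division are class attributes, they are looked up on the class after unpickling rather than stored in the pickle.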

What Can be Pickled and Unpickled?

The pickle module can pickle a variety of objects, including strings, integers, floats, tuples, named functions, classes, and others.

However, not all types of objects are picklable. Objects that wrap external resources, such as file handles, sockets, and database connections, cannot be pickled. Instances of custom classes that hold such objects as attributes cannot be pickled either, unless the class customizes pickling (for example with the __getstate__ and __setstate__ methods).

Here's an example of attempting to pickle a database connection.

import pickle
import sqlite3

conn = sqlite3.connect(":memory:")

# Pickling db connection object
pickle.dumps(conn)

When you run this code, you will receive a TypeError stating that the connection object cannot be pickled.

TypeError: cannot pickle 'sqlite3.Connection' object

Similarly, functions that are not defined with the def keyword, such as the lambda function, cannot be pickled using the pickle module.

import pickle

lambda_obj = lambda x: x ** 2
pickle.dumps(lambda_obj)

The above code attempts to pickle the lambda function object, but it raises an error.

Traceback (most recent call last):
  ...
    pickle.dumps(lambda_obj)
_pickle.PicklingError: Can't pickle <function <lambda> at 0x000001EB55373E20>: attribute lookup <lambda> on __main__ failed
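A quick way to find out whether a particular object is picklable is simply to try it. The helper below is a minimal sketch; is_picklable is a hypothetical name and not part of the pickle module, and the set of exceptions it catches is an assumption based on the errors shown in this article.

```python
import pickle
import sqlite3

def is_picklable(obj):
    """Return True if pickle.dumps() accepts obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, TypeError, AttributeError):
        # Errors like the ones shown above for connections and lambdas
        return False
    return True

conn = sqlite3.connect(":memory:")
candidates = {
    "dict": {"a": 1},
    "tuple": (1, 2.5, "x"),
    "built-in function": len,
    "lambda": lambda x: x ** 2,
    "sqlite3 connection": conn,
}

for label, obj in candidates.items():
    print(f"{label}: {is_picklable(obj)}")

conn.close()
```

The dictionary, tuple and built-in function should be reported as picklable, while the lambda and the connection should not.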

Modify the Pickling Behaviour of the Class

Let's say you have a class that contains different attributes and some of them are unpicklable. In that case, you can override the __getstate__ method of the class to choose what you want to pickle during the pickling process.

import pickle

class SampleTask:
    def __init__(self):
        self.first = 2**17
        self.second = "This is a string".upper()
        self.third = lambda x: x**x

obj = SampleTask()
pickle_instance = pickle.dumps(obj)
unpickle = pickle.loads(pickle_instance)
print(unpickle.__dict__)

If you directly run the above code, the result will be an error due to the lambda function defined within the class which is unpicklable.

Traceback (most recent call last):
  ....
    pickle_instance = pickle.dumps(obj)
AttributeError: Can't pickle local object 'SampleTask.__init__.<locals>.<lambda>'

To handle this kind of situation, you can control what gets pickled for a class instance by overriding the __getstate__ method.

import pickle

class SampleTask:
    def __init__(self):
        self.first = 2**17
        self.second = "This is a string".upper()
        self.third = lambda x: x**x

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['third']
        return state

obj = SampleTask()
pickle_instance = pickle.dumps(obj)
unpickle = pickle.loads(pickle_instance)
print(unpickle.__dict__)

In the above example, the __getstate__ method is defined, and within this method, a copy of the attributes is made. To exclude the lambda function from the pickling process, the attribute named third is removed and then the attributes are returned.

When you run the above example, you will get the dictionary containing the results of the attributes.

{'first': 131072, 'second': 'THIS IS A STRING'}

Now if you want the excluded lambda expression to appear in the unpickled dictionary above, you can use the __setstate__ method to restore the state of the class's object.

import pickle

class SampleTask:
    def __init__(self):
        self.first = 2**17
        self.second = "This is a string".upper()
        self.third = lambda x: x**x

    def __getstate__(self):
        state = self.__dict__.copy()
        del state['third']
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.third = lambda x: x**x

obj = SampleTask()
pickle_instance = pickle.dumps(obj)
unpickle = pickle.loads(pickle_instance)
print(unpickle.__dict__)

In the above code, the __setstate__ method is called during unpickling with the pickled state. It restores the saved attributes and then recreates the lambda function that __getstate__ excluded.

When you run the above code, you will see the dictionary having the lambda function object.

{'first': 131072, 'second': 'THIS IS A STRING', 'third': <function SampleTask.__setstate__.<locals>.<lambda> at 0x000001C54EEB67A0>}

Customizing Pickling: Modifying Class Behavior for Database Connections

As you know, a variety of objects are unpicklable. Here's an example that shows how you can pickle the database connection object by modifying the pickling behavior of the class.

# pickling_db_obj.py

import pickle
import sqlite3

class DBConnection:
    def __init__(self, db_name):
        self.db_name = db_name
        self.connection = sqlite3.connect(db_name)
        self.cur = self.connection.cursor()

    # Method for creating db table
    def create_table(self):
        self.connection.execute("CREATE TABLE IF NOT EXISTS users (name TEXT)")
        return self.connection

    # Method for inserting data into db table
    def create_entry(self):
        self.connection.execute("INSERT INTO users (name) VALUES ('Sachin')")
        res = self.connection.execute("SELECT * FROM users")
        result = res.fetchall()
        print(result)
        return self.connection

    # Method for closing db connection
    def close_db_connection(self):
        self.cur.close()
        self.connection.close()

The above code defines a class DBConnection, and the SQLite database connection is initialized within this class.

In addition, three new methods are added: create_table (for creating a database table), create_entry (for inserting and retrieving data from the table), and close_db_connection (for closing the database connection).

Now exclude the database connection from the pickling process using the __getstate__ method.

# pickling_db_obj.py
...
    def __getstate__(self):
        state = self.__dict__.copy()
        # Exclude the connection and cursor from pickling
        del state['connection']
        del state['cur']
        return state

db_conn = DBConnection(":memory:")
pickle_db_conn = pickle.dumps(db_conn)
unpickle_db_conn = pickle.loads(pickle_db_conn)
print(unpickle_db_conn.__dict__)

The __getstate__ method creates a copy of the object's dictionary, then removes the connection (state['connection']) and cursor (state['cur']) and returns the dictionary (state).

An instance of the DBConnection class is created with the database name ":memory:", which creates the database in memory.

The instance is then pickled and unpickled, and the attribute dictionary of the unpickled object is printed.

{'db_name': ':memory:'}

As you can see, the dictionary of the object only contains the database name. The connection and cursor objects have been removed.

The __setstate__ method is now required to restore the object's original state during unpickling, in which the database connection will be reestablished.

# pickling_db_obj.py
...
    ...
    # Restoring the original state of the object
    def __setstate__(self, state):
        self.__dict__.update(state)
        self.connection = sqlite3.connect(self.db_name)
        self.cur = self.connection.cursor()

db_conn = DBConnection(":memory:")
pickle_db_conn = pickle.dumps(db_conn)
unpickle_db_conn = pickle.loads(pickle_db_conn)
unpickle_db_conn.create_table()
unpickle_db_conn.create_entry()
unpickle_db_conn.close_db_connection()
print(unpickle_db_conn.__dict__)

Within the __setstate__ method, the state dictionary is updated and the new database connection and the cursor are created.

To check if the pickling process works, the create_table, create_entry, and close_db_connection methods are called on the unpickled class instance (unpickle_db_conn).

When you run the whole script, you will obtain the following output.

[('Sachin',)]
{'db_name': ':memory:', 'connection': <sqlite3.Connection object at 0x00000240D6F12A40>, 'cur': <sqlite3.Cursor object at 0x00000240D78044C0>}

As you can see, everything went well, and the object's dictionary now has both a connection and a cursor object along with the database name, demonstrating the successful unpickling of the database connection.

Keep in mind that, according to the Python documentation for the pickle module, if the __getstate__ method returns a false value, the __setstate__ method will not be called upon unpickling.

While the ability to customize the __setstate__ method during unpickling provides flexibility, it also comes with security considerations. Arbitrary code can be executed during unpickling, which can be a security risk if the pickled data comes from untrusted or malicious sources.

So, what can you do to reduce the security risk? You can't do much, but you can make sure that data from untrustworthy sources isn't unpickled. Validate the authenticity of the pickled data during unpickling by using cryptographic signatures to ensure that it has not been tampered with, and if possible, sanitize the pickled data by checking for malicious content.
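One way to apply the signature idea described above is the standard library's hmac module. The following is a minimal sketch, assuming the writer and the reader share a secret key; SECRET_KEY, sign_and_pickle and verify_and_unpickle are hypothetical names introduced here for illustration.

```python
import hashlib
import hmac
import pickle

SECRET_KEY = b"replace-with-a-real-secret"    # hypothetical key, shared by both sides
DIGEST_SIZE = hashlib.sha256().digest_size    # length of a SHA-256 digest in bytes

def sign_and_pickle(obj):
    """Pickle obj and prepend an HMAC-SHA256 signature of the pickled bytes."""
    payload = pickle.dumps(obj)
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    return signature + payload

def verify_and_unpickle(blob):
    """Check the signature first and unpickle only if it matches."""
    signature, payload = blob[:DIGEST_SIZE], blob[DIGEST_SIZE:]
    expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).digest()
    # compare_digest performs a constant-time comparison
    if not hmac.compare_digest(signature, expected):
        raise ValueError("signature mismatch: refusing to unpickle")
    return pickle.loads(payload)

blob = sign_and_pickle({"lib": "pickle", "version": 2.1})
print(verify_and_unpickle(blob))
```

A matching signature only shows that the data was produced by someone who holds the key and was not modified afterwards. It does not make unpickling safe if the key leaks or if the trusted producer itself pickles something harmful.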

Conclusion

The pickle module lets you serialize and deserialize Python objects, and now you know how to use it. You can convert an object into bytes that can be transmitted over a network or saved to disk for later use.

In this article, you've learned:

What are object serialization and deserialization
How to pickle and unpickle data using the pickle module
What type of object can be pickled
How to modify the pickling behavior of the class
How to modify the class behavior for database connection

That's all for now

Keep Coding

How to Use Pytest to Debug/Test Python Code

Sachin Pal — Thu, 07 Dec 2023 16:09:58 GMT

You may have done unit testing or heard the term unit test, which involves breaking down your code into smaller units and testing them to see if they are producing the correct output.

Python has a robust unit testing library called unittest that provides a comprehensive set of testing features. However, some developers believe that unittest is more verbose than other testing frameworks.

In this article, we'll look at how to use the pytest library to create small, concise test cases for your code. Throughout the process, you'll learn about the pytest library's key features.

Installation

Pytest is a third-party library that must be installed in your project environment. In your terminal window, type the following command.

pip install pytest

Pytest has been installed in your project environment, and all of its functions and classes are now available for use.

Getting Started With Pytest

Before getting into what pytest can do, let's take a look at how to use it to test the code.

Here's a Python file test_square.py that contains a square function and a test called test_answer.

# test_square.py

def square(num):
    return num**2

def test_answer():
    assert square(3) == 10

To run the above test, simply enter the pytest command into your terminal, and the rest will be handled by the pytest library.

D:\SACHIN\Pycharm\pytestt_lib>pytest
========================= test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 1 item

test_square.py F                                                 [100%]

============================== FAILURES ===============================
_____________________________ test_answer _____________________________

    def test_answer():
>       assert square(3) == 10
E       assert 9 == 10
E        +  where 9 = square(3)

test_square.py:7: AssertionError
======================= short test summary info =======================
FAILED test_square.py::test_answer - assert 9 == 10
========================== 1 failed in 0.27s ==========================

The above test failed, as evidenced by the output generated by the pytest library. You might be wondering how pytest discovered and ran the test when no arguments were passed.

This occurred because pytest uses standard test discovery. This includes the conventions that must be followed in order for testing to be successful.

When no argument is specified, pytest searches for files named in the test_*.py or *_test.py format.
From these files, pytest collects test-prefixed functions defined outside of classes, as well as test-prefixed methods inside Test-prefixed classes that do not have an __init__ method.
Pytest also finds tests in subdirectories, making it simple to organize your tests within the context of your project structure.
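The conventions above can be sketched with a small file. The file name test_discovery_demo.py and every function and class name in it are hypothetical; the comments state what pytest's default discovery rules would be expected to do with each item.

```python
# test_discovery_demo.py (hypothetical file name; the test_ prefix lets pytest find it)

def square(num):
    return num ** 2

def helper_values():
    # Not collected: the name does not start with "test"
    return [2, 3]

def test_square_of_two():
    # Collected: a module-level function with a "test" prefix
    assert square(2) == 4

class TestSquare:
    # Collected: the class name starts with "Test" and it has no __init__ method
    def test_square_of_three(self):
        assert square(3) == 9

    def check_negative(self):
        # Not collected: the method name lacks the "test" prefix
        assert square(-3) == 9
```

Running pytest in the directory that contains this file should therefore report two collected tests.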

Why do Most Prefer pytest?

If you've used the unittest library before, you'll know that even writing a small test requires more code than pytest. Here's an example to demonstrate.

Assume you want to write a unittest test suite to test your code.

# test_unittest.py

import unittest

class TestWithUnittest(unittest.TestCase):
    def test_query(self):
        sentence = "Welcome to GeekPython"
        self.assertTrue("P" in sentence)
        self.assertFalse("e" in sentence)

    def test_capitalize(self):
        self.assertEqual("geek".capitalize(), "Geek")

Now, from the command line, run these tests with unittest.

D:\SACHIN\Pycharm\pytestt_lib>python -m unittest test_unittest.py
.F
======================================================================
FAIL: test_query (test_unittest.TestWithUnittest)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "D:\SACHIN\Pycharm\pytestt_lib\test_unittest.py", line 9, in test_query
    self.assertFalse("e" in sentence)
AssertionError: True is not false

----------------------------------------------------------------------
Ran 2 tests in 0.001s

FAILED (failures=1)

As you can see, the test_query test failed while the test_capitalize test passed, as expected by the code.

However, writing those tests requires more lines of code, which include:

Importing the unittest module.
A test class (TestWithUnittest) is created by subclassing TestCase.
Making assertions with unittest's assert methods (assertTrue, assertFalse, and assertEqual).

However, this is not the case with pytest. If you wrote those tests with pytest, they would look like this:

# test_pytest.py

def test_query():
    sentence = "Welcome to GeekPython"
    assert "P" in sentence
    assert "e" not in sentence

def test_capitalize():
    # Compare with ==; writing `assert value, "Geek"` would only set a failure message
    assert "geek".capitalize() == "Geek"

It's as simple as that: there's no need to import the package or use predefined assertion methods. You also get nicer output with a detailed description of each failure.

D:\SACHIN\Pycharm\pytestt_lib>pytest
========================= test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 2 items

test_pytest.py F.                                                [100%]

============================== FAILURES ===============================
_____________________________ test_query ______________________________

    def test_query():
        sentence = "Welcome to GeekPython"
        assert "P" in sentence
>       assert "e" not in sentence
E       AssertionError: assert 'e' not in 'Welcome to GeekPython'
E         'e' is contained here:
E           Welcome to GeekPython
E         ?  +

test_pytest.py:6: AssertionError
======================= short test summary info =======================
FAILED test_pytest.py::test_query - AssertionError: assert 'e' not in 'Welcome to GeekPython'
===================== 1 failed, 1 passed in 0.31s =====================

The following information can be found in the output:

The platform on which the test is run, the library versions used, the root directory where the test files are stored, and the plugins used.
The test file that was collected, in this case test_pytest.py.
The test result, which is an "F" and a dot (.). An "F" indicates a failed test, a dot (.) indicates a passed test, and an "E" indicates an unexpected error that occurred during testing.
Finally, a test summary, which prints the results of the tests.

Parametrize Tests

What exactly is parametrization? Parametrization is the process of running multiple sets of tests on the same test function or class, each with a different set of parameters or arguments. This allows you to test the expected results of different input values.

If you want to write multiple tests to evaluate various arguments for the square function, your first thought might be to write them as follows:

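import pytest

# Function to return the square of specified number
def square(num):
    return num ** 2

# Evaluating square of different numbers
def test_square_of_int():
    assert square(5) == 25

def test_square_of_float():
    # 5.2 ** 2 is 27.040000000000003 in floating point, so compare approximately
    assert square(5.2) == pytest.approx(27.04)

def test_square_of_complex_num():
    assert square(5j+5) == 50j

def test_square_of_string():
    # The ** operator is not defined for strings, so a TypeError is expected
    with pytest.raises(TypeError):
        square("5")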

But wait, there's a twist, pytest saves you from writing even more boilerplate code. To allow the parametrization of arguments for a test function, pytest provides the @pytest.mark.parametrize decorator.

Using parametrization, you can eliminate code duplication and significantly reduce your test code.

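import pytest

def square(num):
    return num ** 2

@pytest.mark.parametrize("num, expected", [
    (5, 25),
    # 5.2 ** 2 is 27.040000000000003 in floating point, so compare approximately
    (5.2, pytest.approx(27.04)),
    (5j + 5, 50j),
    # "5" ** 2 raises TypeError, so this case is marked as an expected failure
    pytest.param("5", "25", marks=pytest.mark.xfail(raises=TypeError)),
])
def test_square(num, expected):
    assert square(num) == expected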

The @pytest.mark.parametrize decorator defines four different ("num, expected") tuples in the preceding code. The test_square test function will execute each of them one at a time, and the test report will be generated by determining whether the num evaluated is equal to the expected value.
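As a small variation on the idea, each parameter set can be given a readable label through the ids argument of @pytest.mark.parametrize. The following is a sketch; the test name test_square_with_ids, the input values and the labels are illustrative choices and do not come from the example above.

```python
import pytest

def square(num):
    return num ** 2

@pytest.mark.parametrize(
    "num, expected",
    [(5, 25), (-4, 16), (0, 0)],
    ids=["positive", "negative", "zero"],  # labels shown in the test report
)
def test_square_with_ids(num, expected):
    assert square(num) == expected
```

When run with pytest -v, each case should appear under its label, for example test_square_with_ids[positive], which makes a failing case easier to spot than the default labels generated from the values.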

Pytest Fixtures

Using pytest fixtures, you can avoid duplicating setup code across multiple tests. By defining a function with the @pytest.fixture decorator, you create a reusable setup that can be shared across multiple test functions or classes.

In testing, a fixture provides a defined, reliable, and consistent context for the tests. This could include environment (for example a database configured with known parameters) or content (such as a dataset). (Source: pytest documentation)

Here's an example of when you should use fixtures. Assume you have a continuous stream of dynamic vehicle data and want to write a function collect_vehicle_number_from_delhi() to extract vehicle numbers belonging to Delhi.

# fixtures_pytest.py

def collect_vehicle_number_from_delhi(vehicle_detail):
    data_collected = []
    for item in vehicle_detail:
        vehicle_number = item.get("vehicle_number", "")
        if "DL" in vehicle_number:
            data_collected.append(f"{vehicle_number}")
    return data_collected

To check whether the function works properly, you would write a test that looks like the following:

# test_pytest_fixture.py

from fixtures_pytest import collect_vehicle_number_from_delhi

def test_collect_vehicle_number_from_delhi():
    vehicle_detail = [
        {
            "category": "Car",
            "vehicle_number": "DL04R1441"
        },
        {
            "category": "Bike",
            "vehicle_number": "HR04R1441"
        },
        {
            "category": "Car",
            "vehicle_number": "DL04R1541"
        }
    ]
    expected_result = [
        "DL04R1441",
        "DL04R1541"
    ]
    assert collect_vehicle_number_from_delhi(vehicle_detail) == expected_result

The test function test_collect_vehicle_number_from_delhi() above checks whether the collect_vehicle_number_from_delhi() function extracts the data as expected. Now suppose you want to extract the vehicle numbers from another state. You would write another function, collect_vehicle_number_from_haryana().

# fixtures_pytest.py

def collect_vehicle_number_from_delhi(vehicle_detail):
    # Remaining code
    ...

def collect_vehicle_number_from_haryana(vehicle_detail):
    data_collected = []
    for item in vehicle_detail:
        vehicle_number = item.get("vehicle_number", "")
        if "HR" in vehicle_number:
            data_collected.append(f"{vehicle_number}")
    return data_collected

Following the creation of this function, you will create another test function and repeat the process.

# test_pytest_fixture.py

from fixtures_pytest import collect_vehicle_number_from_haryana

def test_collect_vehicle_number_from_haryana():
    vehicle_detail = [
        {
            "category": "Car",
            "vehicle_number": "DL04R1441"
        },
        {
            "category": "Bike",
            "vehicle_number": "HR04R1441"
        },
        {
            "category": "Car",
            "vehicle_number": "DL04R1541"
        }
    ]
    expected_result = [
        "HR04R1441"
    ]
    assert collect_vehicle_number_from_haryana(vehicle_detail) == expected_result

This is analogous to repeatedly writing the same code. To avoid writing the same code multiple times, create a function decorated with @pytest.fixture here.

# test_pytest_fixture.py

import pytest
from fixtures_pytest import collect_vehicle_number_from_haryana
from fixtures_pytest import collect_vehicle_number_from_delhi

@pytest.fixture
def vehicle_data():
    return [
        {
            "category": "Car",
            "vehicle_number": "DL04R1441"
        },
        {
            "category": "Bike",
            "vehicle_number": "HR04R1441"
        },
        {
            "category": "Car",
            "vehicle_number": "DL04R1541"
        }
    ]

# test 1
def test_collect_vehicle_number_from_delhi(vehicle_data):
    expected_result = [
        "DL04R1441",
        "DL04R1541"
    ]
    assert collect_vehicle_number_from_delhi(vehicle_data) == expected_result

# test 2
def test_collect_vehicle_number_from_haryana(vehicle_data):
    expected_result = [
        "HR04R1441"
    ]
    assert collect_vehicle_number_from_haryana(vehicle_data) == expected_result

As you can see from the code above, the number of lines has been reduced to some extent, and you can now write a few more tests by reusing the @pytest.fixture decorated function vehicle_data.

Fixture for Database Connection

Consider the example of creating a database connection, in which a fixture is used to set up the resources and then tear them down.

# fixture_for_db_connection.py

import pytest
import sqlite3

@pytest.fixture
def database_connection():
    # Setup Phase
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        "CREATE TABLE users (name TEXT)"
    )

    # Freeze the state and pass the object to test function
    yield conn

    # Teardown Phase
    conn.close()

A fixture database_connection() is created, which creates an SQLite database in memory and establishes a connection, then creates a table, yields the connection, and finally closes the connection once the work is completed.

This fixture can be passed as an argument to the test function. Assume you want to write a test that inserts a value into the database; simply do the following:

# fixture_for_db_connection.py

def test_insert_data(database_connection):
    database_connection.execute(
        "INSERT INTO users (name) VALUES ('Virat Kohli')"
    )
    res = database_connection.execute(
        "SELECT * FROM users"
    )
    result = res.fetchall()

    assert result is not None
    assert ("Virat Kohli",) in result

The test_insert_data() test function takes the database_connection fixture as an argument, which eliminates the need to rewrite the database connection code.

You can now write as many test functions as you want without having to rewrite the database setup code.

Markers in Pytest

Pytest provides a few built-in markers to mark your test functions which can be handy while testing.

In the earlier section, you saw the parametrization of arguments using the @pytest.mark.parametrize decorator. Well, @pytest.mark.parametrize is a decorator that marks a test function for parametrization.

Skipping Tests

If you have a test function that you want to skip during testing for some reason, you can decorate it with @pytest.mark.skip.

In the test_pytest_fixture.py script, for example, you added two new test functions but want to skip testing them because you haven't yet created the collect_vehicle_number_from_punjab() and collect_vehicle_number_from_maharashtra() functions to pass these tests.

# test_pytest_fixture.py

# Previous code here

@pytest.mark.skip(reason="Not implemented yet")
def test_collect_vehicle_number_from_punjab(vehicle_data):
    expected_result = [
        "PB3SQ4141"
    ]
    assert collect_vehicle_number_from_punjab(vehicle_data) == expected_result

@pytest.mark.skip(reason="Not implemented yet")
def test_collect_vehicle_number_from_maharashtra(vehicle_data):
    expected_result = [
        "MH05X1251"
    ]
    assert collect_vehicle_number_from_maharashtra(vehicle_data) == expected_result

Both test functions in this script are marked with @pytest.mark.skip and provide a reason for skipping. When you run this script, pytest will bypass these tests.

D:\SACHIN\Pycharm\pytestt_lib>pytest test_pytest_fixture.py
========================= test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 4 items

test_pytest_feature.py ..ss                                      [100%]

==================== 2 passed, 2 skipped in 0.05s =====================

The report shows that two tests were passed and two were skipped.

If you want to skip a test function conditionally, use the @pytest.mark.skipif decorator to mark the test function. Here's an illustration.

# test_pytest_fixture.py

# Previous code here

@pytest.mark.skipif(
    pytest.version_tuple < (7, 2),
    reason="pytest version is less than 7.2"
)
def test_collect_vehicle_number_from_punjab(vehicle_data):
    expected_result = [
        "PB3SQ4141"
    ]
    assert collect_vehicle_number_from_punjab(vehicle_data) == expected_result

@pytest.mark.skipif(
    pytest.version_tuple < (7, 2),
    reason="pytest version is less than 7.2"
)
def test_collect_vehicle_number_from_karnataka(vehicle_data):
    expected_result = [
        "KR3SQ4141"
    ]
    assert collect_vehicle_number_from_karnataka(vehicle_data) == expected_result

In this example, two test functions (test_collect_vehicle_number_from_punjab and test_collect_vehicle_number_from_karnataka) are decorated with @pytest.mark.skipif. The condition specified in each case is pytest.version_tuple < (7, 2), which means that these tests will be skipped if the installed pytest version is less than 7.2. The reason parameter provides a message explaining why the tests are being skipped.
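The skipif condition can be any boolean expression, not only a check on the pytest version. The sketch below uses two other common conditions, the Python version and the operating system; the test names are hypothetical and unrelated to the vehicle example.

```python
import os
import sys

import pytest

@pytest.mark.skipif(
    sys.version_info < (3, 10),
    reason="int.bit_count() requires Python 3.10 or newer"
)
def test_bit_count():
    # 5 is 0b101 in binary, which has two bits set
    assert (5).bit_count() == 2

@pytest.mark.skipif(
    sys.platform == "win32",
    reason="this check expects the POSIX path separator"
)
def test_posix_separator():
    assert os.sep == "/"
```

The condition is evaluated when the tests are collected, so a skipped test body is never executed and its reason string appears in the report when pytest is run with the -rs option.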

Filter Warnings

You can add warning filters to specific test functions or classes using the @pytest.mark.filterwarnings marker, allowing you to control which warnings are captured during tests.

Here's an example that builds on the code from above.

# test_pytest_fixture.py

import warnings

# Previous code here

# Helper warning function
def warning_function():
    warnings.warn("Not implemented yet", UserWarning)

@pytest.mark.filterwarnings("error:Not implemented yet")
def test_collect_vehicle_number_from_punjab(vehicle_data):
    warning_function()
    expected_result = ["PB3SQ4141"]
    assert collect_vehicle_number_from_punjab(vehicle_data) == expected_result

@pytest.mark.filterwarnings("error:Not implemented yet")
def test_collect_vehicle_number_from_karnataka(vehicle_data):
    warning_function()
    expected_result = ["KR3SQ4141"]
    assert collect_vehicle_number_from_karnataka(vehicle_data) == expected_result

In this example, a warning message is emitted by a helper warning function (warning_function()).

Both test functions (test_collect_vehicle_number_from_punjab and test_collect_vehicle_number_from_karnataka) are decorated with @pytest.mark.filterwarnings, which specifies that any warning whose message matches "Not implemented yet" should be treated as an error during the execution of these tests.

These test functions call warning_function which, in turn, emits a UserWarning with the specified message.

In the summary of the report generated by pytest, both tests are reported as failed because the warning was raised as an error.

======================= short test summary info =======================
FAILED test_pytest_feature.py::test_collect_vehicle_number_from_punjab - UserWarning: Not implemented yet
FAILED test_pytest_feature.py::test_collect_vehicle_number_from_karnataka - UserWarning: Not implemented yet
========================== 2 failed in 0.31s ==========================
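To see what the error action does on its own, here is a rough standard-library counterpart that runs outside of pytest. It is only a sketch of the underlying mechanism from the warnings module, not a description of how pytest applies the marker internally; the outcome variable is introduced for illustration.

```python
import warnings

def warning_function():
    warnings.warn("Not implemented yet", UserWarning)

with warnings.catch_warnings():
    # Same action and message as in the "error:Not implemented yet" filter string
    warnings.filterwarnings("error", message="Not implemented yet")
    try:
        warning_function()
    except UserWarning as exc:
        # With the "error" action the warning is raised as an exception
        outcome = f"raised: {exc}"
    else:
        outcome = "no exception"

print(outcome)
```

Because the filter is installed inside warnings.catch_warnings(), it is removed again when the with block ends, much like the marker only affects the tests it decorates.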

Pytest Command-line Options

Pytest provides numerous command-line options that allow you to customize or extend the behavior of test execution. You can list all the available pytest options using the following command in your terminal.

pytest --help

Here are some pytest command-line options that you can try when you execute tests.

Running Tests Using Keyword

You can specify which tests to run by following the -k option with a keyword or expression. Assume you have the Python file test_sample.py, which contains the tests listed below.

def square(num):
    return num**2

# Test 1
def test_special_one():
    a = 2
    assert square(a) == 4

# Test 2
def test_special_two():
    x = 3
    assert square(x) == 9

# Test 3
def test_normal_three():
    x = 3
    assert square(x) == 9

If you want to run only the tests that contain "test_special" in their names, use the following command.

D:\SACHIN\Pycharm\pytestt_lib>pytest -k test_special
========================= test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 3 items / 1 deselected / 2 selected

test_sample.py ..                                                [100%]

=================== 2 passed, 1 deselected in 0.07s ===================

The tests that have "test_special" in their name were selected, and the others were deselected.

If you want to run all other tests but not the ones with "test_special" in their names, use the following command.

D:\SACHIN\Pycharm\pytestt_lib>pytest -k "not test_special"
======================== test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 3 items / 2 deselected / 1 selected

test_sample.py .                                                [100%]

================== 1 passed, 2 deselected in 0.06s ===================

The expression "not test_special" in the above command tells pytest to run only those tests that don't have "test_special" in their name.
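The -k option also accepts and, or, and not combined in a single expression. The sketch below is only an illustration: it writes the three tests from test_sample.py into a temporary directory and runs pytest on them through subprocess. The helper name run_pytest_k and the expression "special and not two" are assumptions made for this example, and it assumes pytest is installed for the interpreter running it.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Same three tests as test_sample.py above, kept in a string so the
# example is self-contained.
TEST_SOURCE = '''
def square(num):
    return num ** 2

def test_special_one():
    assert square(2) == 4

def test_special_two():
    assert square(3) == 9

def test_normal_three():
    assert square(3) == 9
'''


def run_pytest_k(expression):
    """Run pytest with `-k expression` on a temporary copy of the tests
    and return the captured report (hypothetical helper)."""
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "test_sample.py").write_text(TEST_SOURCE)
        result = subprocess.run(
            [sys.executable, "-m", "pytest", "-q", "-k", expression, "test_sample.py"],
            cwd=tmp,
            capture_output=True,
            text=True,
        )
    return result.stdout


if __name__ == "__main__":
    # Selects tests whose name contains "special" but not "two".
    print(run_pytest_k("special and not two"))
```

With this expression only test_special_one should be selected, and the other two tests should be reported as deselected.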

Customizing Output

You can use the following options to customize the output and the report:

-v, --verbose - Increases verbosity
--no-header - Disables header
--no-summary - Disables summary
-q, --quiet - Decreases verbosity

Output of the tests with increased verbosity.

D:\SACHIN\Pycharm\pytestt_lib>pytest -v test_sample.py
======================== test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0 -- D:\SACHIN\Python310\python.exe
cachedir: .pytest_cache
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 3 items

test_sample.py::test_special_one PASSED                         [ 33%]
test_sample.py::test_special_two PASSED                         [ 66%]
test_sample.py::test_normal_three PASSED                        [100%]

========================= 3 passed in 0.04s ==========================

Output of the tests with decreased verbosity.

D:\SACHIN\Pycharm\pytestt_lib>pytest -q test_sample.py
...                                                             [100%]
3 passed in 0.02s

Using --no-header and --no-summary together also trims the report, much like -q (decreased verbosity), although the resulting output is not identical.

Test Collection

Using the --collect-only, --co option, pytest collects all the tests but doesn't execute them.

D:\SACHIN\Pycharm\pytestt_lib>pytest --collect-only test_sample.py
======================== test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 3 items

==================== 3 tests collected in 0.02s ======================

Ignore Path or File during Test Collection

If you don't want to collect tests from a specific path or file, use the --ignore=path option.

D:\SACHIN\Pycharm\pytestt_lib>pytest --ignore=test_sample.py
======================== test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 1 item

test_square.py .                                                [100%]

========================= 1 passed in 0.06s ==========================

The test_sample.py file is ignored by pytest during test collection in the above example.

Exit on First Failed Test or Error

When you use the -x, --exitfirst option, pytest exits the test execution on the first failed test or error that it finds.

D:\SACHIN\Pycharm\pytestt_lib>pytest -x test_sample.py
======================== test session starts =========================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\pytestt_lib
plugins: anyio-3.6.2
collected 3 items

test_sample.py F

============================== FAILURES ==============================
__________________________ test_special_one __________________________

    def test_special_one():
        a = 2
>       assert square(a) == 5
E       assert 4 == 5
E        +  where 4 = square(2)

test_sample.py:6: AssertionError
====================== short test summary info =======================
FAILED test_sample.py::test_special_one - assert 4 == 5
!!!!!!!!!!!!!!!!!!!!! stopping after 1 failures !!!!!!!!!!!!!!!!!!!!!!
========================= 1 failed in 0.35s ==========================

Pytest stops the test run as soon as it finds a failed test, and a stopping message appears in the report summary. In this run, test_special_one was changed to assert square(a) == 5 so that it fails.

Conclusion

Pytest is a testing framework that allows you to write small and readable tests to test or debug your code.

In this article, you've learned:

How to use pytest for testing your code
How to parametrize arguments to avoid code duplication
How to use fixtures in pytest
Pytest command-line options

That's all for now

Keep Coding

What is if __name__ == '__main__' in Python Programs

Sachin Pal — Mon, 27 Nov 2023 15:05:57 GMT

You may have seen if __name__ == '__main__': in a Python script, followed by some code written inside the block. Have you ever wondered what this block is and why it is used?

In this article, you'll discover the concept of if __name__ == '__main__' block.

Understanding if __name__ == '__main__'

Well, if __name__ == '__main__': is not some magical keyword or incantation in Python. Rather, it is a way to ensure that specific code runs only when the file is executed directly, not when it is imported as a module.

The expression means that further action is taken only when a certain condition is met: if the name of the currently running module (__name__) equals "__main__", the code inside the if __name__ == '__main__': block is executed; otherwise it is skipped.

Here's a Python script in which a function is defined that returns a square of the value and a print statement that prints the square of the specified integer.

# math_module.py

# Defined a Function
def square(num):
    value = num * num
    return value

print("Main Script.")
result = square(2)
print(f"Square of 2: {result}.")

----------
Main Script.
Square of 2: 4.

If you run the script directly, it will produce the desired result, which is 4. The issue arises when you try to reuse the square function in another script.

A new Python script (square.py) is created, and the square function from the previous script is imported into it.

# square.py

# Main Script Imported as Module
from math_module import square

print("Script Imported as Module.")
result = square(3)
print(f"Square of 3: {result}.")

When you run the above script, the console will display the following output.

Main Script.
Square of 2: 4.
Script Imported as Module.
Square of 3: 9.

The output shows that the code from the main script (math_module.py) is executed first, followed by the code from the currently running script.

You reused the square function, but the code inside the main script is executed automatically, which you obviously do not want. This is where you can use the if __name__ == "__main__": block functionality.

Why It Is Used?

If you add the if __name__ == "__main__": block after the square function code in the main script, the code within the if __name__ == "__main__": block will not be executed automatically when you import the script in another module.

# math_module.py

# Defined a Function
def square(num):
    value = num * num
    return value

if __name__ == "__main__":
    print("Main Script.")
    result = square(2)
    print(f"Square of 2: {result}.")

The if __name__ == "__main__": block is added to the main script (math_module.py), along with some code. If you import the square function from the main script into the second script (square.py), the main script's code will no longer run automatically.

# square.py

# Main Script Imported as Module
from math_module import square

print("Script Imported as Module.")
result = square(3)
print(f"Square of 3: {result}.")

When you run the above code, you will get the following result.

Script Imported as Module.
Square of 3: 9.

You may be wondering how a single line of code can keep the code within the if __name__ == "__main__": block from running when the script is imported into another script.

How Did It Happen?

Curious why the code in the if __name__ == "__main__": block was not executed when you ran square.py? Print the value of __name__ in math_module.py and run that script directly.

# math_module.py

# Defined a Function
def square(num):
    value = num * num
    return value

print(f"Name of the Module: {__name__}.")
# Remaining code

When you run the above code, you will get the following output.

Name of the Module: __main__.

When a Python script is run, the __name__ variable is set to "__main__", indicating that the script is being run as the main program.

When a Python script is imported as a module into another script, the __name__ variable is set to the name of the script/module.

So, code within the if __name__ == "__main__": block will only be executed if the script is run directly, not when it is imported as a module. If you also print __name__ in square.py and run it, you will get the following output: the first line comes from the imported math_module.py and the second from square.py itself.

# square.py

# Main Script Imported as Module
from math_module import square

print(f"Name of the Module: {__name__}.")
# Remaining code

----------
Name of the Module: math_module.
Name of the Module: __main__.

You can see that __name__ inside the main script (math_module.py) is now "math_module" instead of "__main__", which is why the code in the if __name__ == "__main__": block didn't get executed when the square.py script was executed.
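The difference between running and importing can be checked in one go. The following is a minimal sketch, not part of the original scripts: it writes a small module (the hypothetical file name demo_module.py) into a temporary directory, runs it directly, then imports it from a second Python process, and returns what each run printed.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# A tiny module that reports its own __name__ and guards one print call.
MODULE_SOURCE = '''
def square(num):
    return num * num

print(f"__name__ is {__name__!r}")

if __name__ == "__main__":
    print("Running as a script.")
'''


def run_both_ways():
    """Return (output when run directly, output when imported)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "demo_module.py"
        path.write_text(MODULE_SOURCE)

        # 1) Execute the file directly: __name__ is "__main__".
        direct = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True, text=True, check=True,
        ).stdout

        # 2) Import it from another process: __name__ is "demo_module".
        imported = subprocess.run(
            [sys.executable, "-c", "import demo_module"],
            cwd=tmp, capture_output=True, text=True, check=True,
        ).stdout
    return direct, imported


if __name__ == "__main__":
    direct, imported = run_both_ways()
    print("Run directly:\n" + direct)
    print("Imported:\n" + imported)
```

Only the direct run prints "Running as a script.", because that is the only case in which the guard condition is true.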

Conclusion

In a nutshell, if __name__ == "__main__": is like a gatekeeper that ensures that a specific portion of your Python script runs only when you directly execute that script. It keeps things clean and organized, and helps you create versatile Python modules for your projects.

The if __name__ == "__main__": can be very handy in larger projects where you have multiple scripts working together. You can import functions and classes from one script into another without running the whole program every time.

That's all for now

Keep Coding

Secure Passwords by Hashing Using bcrypt Library in Python

Sachin Pal — Sun, 05 Nov 2023 06:04:09 GMT

Web-based services and websites store hashed versions of your passwords, which means your actual password isn't visible or stored in their database; instead, a fixed-length string of characters is stored.

Hashing is a security technique used to protect passwords or other text stored in databases. A hash function generates a fixed-length string of characters from the password provided by the user.

Let's see how the hashing is done. In this article, you'll use the bcrypt library to hash the user's password and then compare that hashed password to the actual password in Python. You'll also learn more about the bcrypt library.

Installing bcrypt

Open your terminal window and run the following command to install the bcrypt library using pip.

pip install bcrypt

Now that bcrypt is installed on your system, the next step is to use it for hashing the user's password.

Hash Password using bcrypt

In this section, you'll see the functions provided by the bcrypt library that will help you generate salt and hash values.

import bcrypt

# Password to Hash
my_password = b'Sachinfromgeekpython'

# Generating Salt
salt = bcrypt.gensalt()

# Hashing Password
hash_password = bcrypt.hashpw(
    password=my_password,
    salt=salt
)

print(f"Actual Password: {my_password.decode('utf-8')}")
# Print Hashed Password
print(f"Hashed Password: {hash_password.decode('utf-8')}")

The above code imports the bcrypt library for hashing the password. A test password is provided in bytes and is stored inside the my_password variable.

The code uses the gensalt() function from the bcrypt library to generate the salt, a string of characters to enhance security.

The salt is a random string of characters that is combined with the password before hashing to provide additional security. Because a fresh salt is generated every time, two users with the same password will end up with different hashed passwords.
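The effect of the salt can be seen without bcrypt at all. The sketch below is an illustration only: it swaps in hashlib.pbkdf2_hmac from the standard library as a stand-in for bcrypt so that it runs without extra dependencies, and the helper name hash_with_salt and the iteration count are assumptions made for this example.

```python
import hashlib
import os


def hash_with_salt(password: bytes, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 from the standard library, used here only as a
    # stand-in for bcrypt to show what the salt does.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000)


password = b"Sachinfromgeekpython"

# Two users with the same password get two different random salts.
salt_a = os.urandom(16)
salt_b = os.urandom(16)

hash_a = hash_with_salt(password, salt_a)
hash_b = hash_with_salt(password, salt_b)

# Same password, different salts: the stored hashes differ.
print(hash_a != hash_b)

# Same password, same salt: the hash is reproducible, which is what
# makes verification possible later.
print(hash_with_salt(password, salt_a) == hash_a)
```

This is also why the salt has to be kept alongside the hash: bcrypt does this for you by embedding the salt in the hashed value it returns.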

Then the actual password (my_password) and salt (salt) are passed to the hashpw() function from the bcrypt library to produce the hash value of the actual password.

Finally, the actual and hashed passwords are decoded and printed.

Actual Password: Sachinfromgeekpython
Hashed Password: $2b$12$RF6JLXecIE4qujuPgTwkC.GN2BsOmGf8Ji10LyquoBaHkHWUWgiAm

Check Password using bcrypt

Now that you've hashed the password, the next step is to verify the actual password's hash value against the user-provided password.

import bcrypt

# Password to Hash
my_password = b'Sachinfromgeekpython'

# Generating Salt
salt = bcrypt.gensalt()

# Hashing Password
hash_password = bcrypt.hashpw(
    password=my_password,
    salt=salt
)

# User-provided Password
user_password = b'Sachinfromgeekpython'

# Checking Password
check = bcrypt.checkpw(
    password=user_password,
    hashed_password=hash_password
)

# This will print True or False
print(check)

# Verifying the Password
if check:
    print("Welcome to GeekPython.")
else:
    print("Invalid Credential.")

The above code uses the checkpw() function from the bcrypt library to check the user-provided password against the hashed password. The hashed password (hash_password) and user-provided password (user_password) are passed inside the function and the result is stored inside the check variable.

Then the code prints the check variable to obtain the result. In the end, an if-else statement is used to verify the password.

True
Welcome to GeekPython.

True in the output above indicates that the hashed password matches the user-provided password, making the first condition true.

Hash Password Using KDF (Key Derivation Function)

A KDF (Key Derivation Function) derives a key from a password, taking a salt and a number of rounds as additional inputs. The derived key can then be used for authentication purposes.

import bcrypt

password = b'Sachinfromgeekpython'
salt = bcrypt.gensalt()

# Using KDF from bcrypt Lib
key = bcrypt.kdf(
    password=password,
    salt=salt,
    desired_key_bytes=32,
    rounds=200
)

# Print Generated Key
print(f"Key: {key}")

The above code uses the kdf() function from the bcrypt library to derive a key from the password. The function is passed with four parameters:

password: This parameter is set to the password variable which contains a byte string.
salt: This parameter is set to the salt variable that contains a unique and fixed-length salt.
desired_key_bytes: This parameter is set to 32 which is the desired length of the derived key we want. You can set it to your own desired length.
rounds: This parameter is set to 200, which is the number of iterations used to make the key derivation more computationally intensive. The higher the number of rounds, the stronger the protection, but the more time and resources the derivation takes.

Finally, the result stored in the key variable is printed.

Key: b'\xc4#VW\x9a\x16\xdbG?\x11\xa9\xf7\xbd\x88"7+zxo\xfe@\xce\xab\x89\xc3g\x1c\xec~\xbe\xf7'

Verifying the Password with KDF

import bcrypt

password = b'Sachinfromgeekpython'
salt = bcrypt.gensalt()

# Using KDF from bcrypt Lib
key = bcrypt.kdf(
    password=password,
    salt=salt,
    desired_key_bytes=32,
    rounds=200
)

# User-provided Password
user_password = b'Sachinfromgeekpython'

# Deriving Key from User-provided Password
user_key = bcrypt.kdf(
    password=user_password,
    salt=salt,
    desired_key_bytes=32,
    rounds=200
)

# Verifying the Password
if user_key == key:
    print("Welcome to GeekPython.")
else:
    print("Invalid Credential.")

The code derives the key from the user-provided password (user_password) and stores it inside the user_key variable.

Then the code compares the key derived from the user-provided password (user_key) with the key derived from the actual password (key).

Welcome to GeekPython.

The output indicates that the key derived from the user-provided password matches the key derived from the actual password.
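The example above only shows the matching case. The sketch below covers a wrong password as well and compares the keys with hmac.compare_digest, which is designed to take the same time whether or not the values match. It is an illustration only: hashlib.pbkdf2_hmac from the standard library is swapped in for bcrypt.kdf so the code runs without extra dependencies, and the helper names derive_key and verify are assumptions made for this example.

```python
import hashlib
import hmac
import os


def derive_key(password: bytes, salt: bytes) -> bytes:
    # Standard-library stand-in for bcrypt.kdf: derives a 32-byte key.
    return hashlib.pbkdf2_hmac("sha256", password, salt, 100_000, dklen=32)


def verify(candidate: bytes, salt: bytes, stored_key: bytes) -> bool:
    # Derive a key from the candidate password with the SAME salt and
    # compare it to the stored key in constant time.
    return hmac.compare_digest(derive_key(candidate, salt), stored_key)


salt = os.urandom(16)
stored_key = derive_key(b"Sachinfromgeekpython", salt)

for attempt in (b"Sachinfromgeekpython", b"wrong-password"):
    if verify(attempt, salt, stored_key):
        print("Welcome to GeekPython.")
    else:
        print("Invalid Credential.")
```

Note that the same salt and the same number of rounds must be used for both derivations; otherwise even the correct password produces a different key.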

Customizing Salt

The gensalt() function accepts two parameters, rounds and prefix. The rounds value is the cost factor recorded in the salt, which controls how expensive the later hashing step is, and prefix sets the version prefix of the salt.

import bcrypt

# Customize Salt
salt = bcrypt.gensalt(
    rounds=30,
    prefix=b'2a'
)

# Print Generated Salt
print(salt.decode('utf-8'))

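The above code customizes the salt generation by passing the rounds parameter, set to 30, and the prefix parameter, set to b'2a', to the gensalt() function. Generating the salt is instant, but keep in mind that each extra round doubles the work of hashing, so hashing a password with a 30-round salt would take an impractically long time; the default is 12, as seen in the hashed password earlier.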

$2a$30$5uKaXaXVceqCjmKkPf2mnu

Notice that the salt begins with the prefix 2a right after the first $, and that the 30 following it indicates the number of rounds.
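The structure of the salt string can be made explicit by splitting it on the $ separator. This is a small sketch using only string operations on the salt printed above; the helper name parse_bcrypt_salt is an assumption made for this example and is not part of the bcrypt library.

```python
def parse_bcrypt_salt(salt: str) -> dict:
    # "$2a$30$5uKa..." splits into ['', '2a', '30', '5uKa...'].
    _, prefix, rounds, encoded = salt.split("$")
    return {"prefix": prefix, "rounds": int(rounds), "encoded_salt": encoded}


info = parse_bcrypt_salt("$2a$30$5uKaXaXVceqCjmKkPf2mnu")
print(info)

# The rounds value is a base-2 logarithm, so the number of iterations
# performed while hashing is 2 ** rounds.
iterations = 2 ** info["rounds"]
print(iterations)
```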

Conclusion

Password hashing prevents the user's actual password from being exposed to attackers. A hash function, which is simply a mathematical function, is used to produce the hash value of the password.

In this article, you've learned to hash the user's password using the bcrypt library in Python and then check the produced hash value against the user-provided password. Additionally, you've seen the KDF (Key Derivation Function) that adds additional security for hashing.

That's all for now

Keep Coding

How To Create a WebSocket Server and Client in Python

Sachin Pal — Thu, 19 Oct 2023 05:30:10 GMT

You must have seen real-time applications where data changes frequently or is updated in real time. Such applications often use a WebSocket to achieve this functionality.

By the end of this article, you'll have learned:

What is WebSocket?
How to create a WebSocket server and client using Python?

What is WebSocket?

A WebSocket allows two-way communication (bidirectional) between two entities over a single TCP connection. This means a WebSocket client and server can interact with each other multiple times in a single connection.

A WebSocket connection is full-duplex, which means data can be sent and received simultaneously in both directions, and the connection stays alive until either the server or the client decides to close it.

It is used in real-time applications to exchange low-latency data in both directions.

How to Create a WebSocket using Python

In this section, you will create a WebSocket server and client using Python. You will use Python's websockets library to create a server and a client.

Install Dependency

Open your terminal window and run the following command:

pip install websockets

Using the websockets library, you can create a websocket server and client in Python super easily.

Create a WebSocket Server

In this section, you'll create a WebSocket server that receives values from the client and, based on those values, sends the appropriate response back to the client.

import websockets
import asyncio

# Creating WebSocket server
async def ws_server(websocket):
    print("WebSocket: Server Started.")

    try:
        while True:
            # Receiving values from client
            name = await websocket.recv()
            age = await websocket.recv()

            # Prompt message when any of the field is missing
            if name == "" or age == "":
                print("Error Receiving Value from Client.")
                break

            # Printing details received by client
            print("Details Received from Client:")
            print(f"Name: {name}")
            print(f"Age: {age}")

            # Sending a response back to the client
            if int(age) < 18:
                await websocket.send(f"Sorry! {name}, You can't join the club.")
            else:
                await websocket.send(f"Welcome aboard, {name}.")

    except websockets.ConnectionClosedError:
        print("Internal Server Error.")


async def main():
    async with websockets.serve(ws_server, "localhost", 7890):
        await asyncio.Future()  # run forever

if __name__ == "__main__":
    asyncio.run(main())

The above code imports the websockets library for creating a WebSocket server and communicating with it and the asyncio library for making use of asynchronous tasks.

The code then defines an asynchronous function called ws_server() that takes WebSocket connection (websocket) as its parameter.

Inside the function, a try block is used that handles the incoming messages from the client. Within the try block, a while loop is created which means the code will run continuously.

Within the loop, the code receives two values from the client using websocket.recv() and stores them in the name and age variables respectively. The code checks whether either value is missing; if so, it prints a message and breaks out of the loop, otherwise it proceeds and prints the values received from the client.

The code then sends the appropriate response back to the client based on the values it gets.
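One detail worth noting: in the server above, int(age) raises a ValueError when the client sends something that is not a number, and the except clause only handles websockets.ConnectionClosedError. A possible way to guard against that is to move the decision into a small pure function that validates its input first. The following is a minimal sketch; the function name build_reply and the wording of the error messages are assumptions made for this example, not part of the original server.

```python
def build_reply(name: str, age_text: str) -> str:
    """Return the message the server would send back for one client."""
    name = name.strip()
    if not name:
        return "Error: name is missing."

    # int() raises ValueError for empty or non-numeric input.
    try:
        age = int(age_text)
    except ValueError:
        return f"Error: {age_text!r} is not a valid age."

    if age < 18:
        return f"Sorry! {name}, You can't join the club."
    return f"Welcome aboard, {name}."


if __name__ == "__main__":
    for name, age in [("Sachin", "22"), ("Sam", "17"), ("Sam", "abc")]:
        print(build_reply(name, age))
```

Inside the handler, the reply could then be sent with a single await websocket.send(build_reply(name, age)) call, and the function can be tested without starting a server.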

In the except block, the code handles any websockets.ConnectionClosedError exceptions that indicate an error occurred during the connection.

The code then defines another asynchronous function called main() that starts a WebSocket server. The WebSocket server is created using websockets.serve() method which listens on localhost and port 7890.

The server runs forever because main() awaits asyncio.Future(), a future that is never given a result. The async with block therefore never exits, which keeps the server alive and listening for incoming messages.
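The "run forever" behavior can be observed in isolation with the standard library alone. This is a minimal sketch in which the function name wait_on_unresolved_future and the timeout value are assumptions made for the example; the timeout exists only so that the demonstration ends.

```python
import asyncio


async def wait_on_unresolved_future(timeout: float) -> str:
    loop = asyncio.get_running_loop()
    never_done = loop.create_future()  # nothing ever sets a result on it

    try:
        # Awaiting the future suspends this coroutine until a result is
        # set. Since that never happens, only the timeout ends the wait.
        await asyncio.wait_for(never_done, timeout)
    except asyncio.TimeoutError:
        return "still waiting"
    return "finished"


if __name__ == "__main__":
    print(asyncio.run(wait_on_unresolved_future(0.1)))
```

While a coroutine is suspended like this, the event loop is free to run other tasks, which is how the server keeps handling client connections.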

In the end, the main() function is run using the asyncio.run(main()).

Create a WebSocket Client

In this section, you'll create a client that takes the input from the user and displays the response sent by the WebSocket server.

import websockets
import asyncio

# The main function that will handle connection and communication
# with the server
async def ws_client():
    print("WebSocket: Client Connected.")
    url = "ws://127.0.0.1:7890"
    # Connect to the server
    async with websockets.connect(url) as ws:
        name = input("Your Name (type 'exit' to quit): ")

        if name == 'exit':
            exit()

        age = input("Your Age: ")
        # Send values to the server
        await ws.send(f"{name}")
        await ws.send(f"{age}")

        # Stay alive forever, listen to incoming msgs
        while True:
            msg = await ws.recv()
            print(msg)

# Start the connection
asyncio.run(ws_client())

The above code defines an asynchronous function called ws_client(). Inside the function, the WebSocket client connects to the WebSocket server using the websockets.connect() method, which is passed the URL (ws://127.0.0.1:7890) of the server created in the previous section.

The user is prompted to enter the name and if they type "exit", the code quits the process. Then the user is asked to enter their age and both values (name and age) are sent to the WebSocket server using the ws.send() method.

After that, the code enters the infinite loop to continuously listen to the incoming messages from the server using the ws.recv() method. The message is then printed on the console.

Finally, the ws_client() function is run using asyncio.run(ws_client()). This will start the WebSocket client to accept the information from the user and display the response sent by the server.

Running the WebSocket Server and Client

You need to put the server and client code in two separate Python files. First, you need to run the WebSocket server script to start the server.

Open a terminal window, navigate to the project directory, and run the script file. In this case, the script is stored in the main.py Python file.

python main.py

Now open another terminal window and run the WebSocket client script. In this case, the script is stored in the client.py file.

python client.py

This will start the WebSocket client connected to the server. You will see the prompts asking for your name and age.

Here, "Sachin" and "22" are sent to the server, and in return, the server responds with the "Welcome aboard, Sachin" message on the console.

The information entered by the user will be displayed on the WebSocket server console. You can see this in the image below.

You can also connect to the server with the interactive client that ships with the websockets library by running the following command.

python -m websockets ws://localhost:7890

Conclusion

In this article, you learned to create a WebSocket server and client using the websockets library in Python. This technology is used in applications in which data changes in real-time.

That's all for now

Keep Coding

How to Create and Integrate MySQL Database with the Flask App

Sachin Pal — Sun, 08 Oct 2023 15:00:00 GMT

MySQL is a widely used open-source relational database known for its performance, reliability, and scalability. It is suitable for various types of software applications, including web applications, e-commerce platforms, and content management systems.

In this article, you'll learn how to create and integrate a MySQL database with a Flask application using the PyMySQL driver, which provides convenient access to MySQL databases within the Flask framework.

Pre-Requisite

You must have Python, Flask, and MySQL installed in your system and have a basic understanding of the Flask framework, Jinja templating, and SQL.

Install PyMySQL Package

You must have the PyMySQL package installed in your project environment before proceeding with this tutorial.

To install the PyMySQL package in your local or virtual environment, open a terminal window and type the following command. It will be used to create and connect the MySQL database to your Flask app.

pip install PyMySQL

Creating MySQL Database using PyMySQL

The PyMySQL library makes it easy to interact with MySQL databases. It enables Python applications to connect to and manipulate MySQL databases. In this section, you'll use the PyMySQL library to create a MySQL database.

Unlike with an SQLite database, you need to supply the username, password, and hostname of the MySQL server to connect to it.

import pymysql

hostname = 'localhost'
user = 'root'
password = 'your_password'

# Initializing connection
db = pymysql.connections.Connection(
    host=hostname,
    user=user,
    password=password
)

# Creating cursor object
cursor = db.cursor()

# Executing SQL query
cursor.execute("CREATE DATABASE IF NOT EXISTS books_db")
cursor.execute("SHOW DATABASES")

# Displaying databases
for databases in cursor:
    print(databases)

# Closing the cursor and connection to the database
cursor.close()
db.close()

The pymysql library has been imported and will be used to connect to the MySQL server and create the database.

The connection to the MySQL server is initialized using the pymysql.connections.Connection object with required credentials such as hostname, username, and password. The instance is then stored inside the db variable.

The cursor object of the db is created and stored inside the cursor variable to interact with the MySQL database.

The first SQL query creates a books_db database on the MySQL server if no such database exists.

The second SQL query uses the SHOW DATABASES statement to retrieve a list of databases on the MySQL server. This query's output is then iterated over and printed.

In the end, the cursor and the database connection are closed using cursor.close() and db.close() respectively.

('books_db',)
('information_schema',)
('mysql',)
('performance_schema',)
('sakila',)
('sys',)
('world',)

As can be seen in the output, a database named books_db is created on the MySQL server.

Integrating MySQL Database with Flask App

In this section, you'll learn to integrate a MySQL database with your Flask app and create a model using SQLAlchemy.

Flask App and SQLAlchemy Setup

from flask import Flask, render_template, request, redirect, url_for
from flask_sqlalchemy import SQLAlchemy

# Creating Flask app
app = Flask(__name__)

# Creating SQLAlchemy instance
db = SQLAlchemy()

The above code imports the Flask class along with render_template, request, redirect, and url_for from the flask module. The SQLAlchemy class is then imported from the flask_sqlalchemy module.

You need to install the Flask-SQLAlchemy library if you haven't installed it already.
Run the pip install Flask-SQLAlchemy command in your terminal to install this dependency.

The Flask app is created by instantiating the Flask class with the value __name__ and storing it in the app variable.

The SQLAlchemy (SQLAlchemy()) instance is created and stored inside the db variable.

Configuring MySQL Database Connection URI

user = "root"
pin = "your_password"
host = "localhost"
db_name = "books_db"

# Configuring database URI
app.config['SQLALCHEMY_DATABASE_URI'] = f"mysql+pymysql://{user}:{pin}@{host}/{db_name}"

# Disable modification tracking
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False

The database connection parameters are defined first: the MySQL username (user), the MySQL password (pin), the hostname (host), and the name of the database to connect to (db_name).

The following line of code sets up the SQLAlchemy database URI to connect to the MySQL database. The URI is broken down as follows:

mysql+pymysql://: This means that the PyMySQL driver is to be used.
username: This is to specify the username and this will be replaced with the value of the user variable.
password: This will be replaced with the value of the pin variable.
hostname: This will be replaced with the value of the host variable.
database_name: This will be replaced with the value of the db_name variable.
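If the password contains characters such as @ or /, a URI built this way would be parsed incorrectly. The following is a minimal sketch of one way to handle that, assuming a made-up password and using quote_plus from the standard library's urllib.parse module to escape it; the credential values are hypothetical and only serve as an illustration.

```python
from urllib.parse import quote_plus

# Hypothetical credentials used only for illustration
user = "root"
pin = "p@ss/word"
host = "localhost"
db_name = "books_db"

# Escape the password so that '@' and '/' are not read as URI delimiters
uri = f"mysql+pymysql://{user}:{quote_plus(pin)}@{host}/{db_name}"
print(uri)
```

The escaped string could then be assigned to app.config['SQLALCHEMY_DATABASE_URI'] in the same way as the unescaped one.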

Finally, modification tracking is disabled by setting app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] to False.

Initializing Flask App with SQLAlchemy

# Initializing Flask app with SQLAlchemy
db.init_app(app)

The SQLAlchemy instance (db) is initialized using the init_app() method with the Flask app (app).

Creating a Database Model

# Creating Models
class Books(db.Model):
    __tablename__ = "books"

    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(500), nullable=False, unique=True)
    author = db.Column(db.String(500), nullable=False)

The code creates a SQLAlchemy model class (Books) that represents a database table. The __tablename__ specifies the name of the table (books).

Within the table, three columns are created: id, title, and author. The db.Column function is used to define database columns.

id: The id column is of type integer (db.Integer) and is a primary key (primary_key=True) which means it will be unique for each row.
title: It is of type string (db.String) and it cannot be left empty (nullable=False) and it will contain unique values (unique=True).
author: It is of type string (db.String) and cannot be left empty (nullable=False).

Creating Table in MySQL Database

def create_db():
    with app.app_context():
        db.create_all()

The above code defines the function create_db(), which creates the table (books) within the MySQL database (books_db). The body of the function runs inside the application context (app.app_context()) so that SQLAlchemy has access to the Flask app and its configuration.

Within the context block, create_all() method is called on the SQLAlchemy instance (db) to create the table defined by the SQLAlchemy model class (Books).

Database Operation - Adding Data Using Flask App

Now that the MySQL database and table have been created, it is time to see if they work and if data can be added to the database. You can do this manually with MySQL workbench or other applications, but you'll do it programmatically with the Flask framework on the frontend.

Creating Backend Logic

# Home route
@app.route("/")
def home():
    details = Books.query.all()
    return render_template("home.html", details=details)

# Add data route
@app.route("/add", methods=['GET', 'POST'])
def add_books():
    if request.method == 'POST':
        book_title = request.form.get('title')
        book_author = request.form.get('author')

        add_detail = Books(
            title=book_title,
            author=book_author
        )
        db.session.add(add_detail)
        db.session.commit()
        return redirect(url_for('home'))

    return render_template("books.html")

The above code defines two routes, "Home" and "Add data", for displaying and adding book details.

Home Route:

The user can access the Home route by using the root URL "/", and when the user accesses the root URL, the home() function is executed.
Using Books.query.all(), the home() function retrieves all records from the database table.
The details variable is then used to pass all of the records to the home.html template, and the template is then rendered.

Add data Route:

Users can add data to the database table by visiting the "/add" URL path, which accepts both GET and POST requests.
When users enter the URL path or make a GET request, the books.html template is rendered, which includes a form with two fields.
The add_books() function handles both request types, and its POST branch processes the submitted form.
The request.form.get() method is used to retrieve the values of the title and author fields from the form, which are then used to create a new Books object.
Then, using the db.session.add() method, the new record is added to the session, and the changes are committed to the MySQL database using the db.session.commit() method.
When all of this is completed, users are redirected to the home page using redirect() together with url_for('home').

Running the Script

if __name__ == "__main__":
    create_db()
    app.run(debug=True)

The above code will call the create_db() function, which will create the table within the database and the app.run() will launch the Flask app on the localhost server.

However, first, you must create the HTML template for the frontend.

Creating Frontend using BootStrap

Create a directory named templates in the root of your project directory and within this directory create three HTML files:

base.html
home.html
books.html

Using the Jinja templating, the data from the database will be displayed.

base.html

<!doctype html>
<html lang="en">
  <head>
    <meta charset="utf-8">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3" crossorigin="anonymous">
    <title>{% block title %} {% endblock %}</title>
  </head>
  <body>
    {% block content %} {% endblock %}
    <script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js" integrity="sha384-ka7Sk0Gln4gmtz2MlQnikT1wXgYsOg+OMhuP+IlRH9sENBO0LRn5q+8nbTov4+1p" crossorigin="anonymous"></script>
  </body>
</html>

This HTML file loads the Bootstrap CSS and JavaScript from a CDN (Content Delivery Network). It also defines two Jinja blocks, title and content, so that child templates can supply a dynamic title and body content without repeating the basic HTML structure.

home.html

{% extends 'base.html' %}
{% block title %}Home of Books{% endblock %}
{% block content %}
<h1 class="text-center my-5">📚Books Detail📚</h1>
<div class="container d-flex justify-content-center align-items-center">
    <a class="btn btn-outline-info mb-3" href="{{ url_for('add_books') }}">Add Books</a>
</div>
<div class="container">
    <table class="table table-dark table-striped">
        <thead>
        <tr>
            <th scope="col">ID</th>
            <th scope="col">Book Title</th>
            <th scope="col">Author</th>
        </tr>
        </thead>
        {% if not details%}
        <div class="text-center">
            <h3 class="my-5">No Records to Display!</h3>
        </div>
        {% else %}
        <tbody>
        {% for data in details %}
        <tr>
            <th scope="row">{{data.id}}</th>
            <td>{{data.title}}</td>
            <td>{{data.author}}</td>
        </tr>
        {% endfor %}
        </tbody>
        {% endif %}
    </table>
</div>
{% endblock %}

In this template, base.html is used as the layout via {% extends 'base.html' %}, and the page title is set in the title block. Within the content block, the records passed from the database (details) are iterated over and displayed in a table styled with Bootstrap.

books.html

{% extends 'base.html' %}
{% block title %}Add Books{% endblock %}
{% block content %}
<h1 class="text-center my-5">📚Book Details📚</h1>
<div class="container">
  <a href="{{ url_for('home') }}" class="btn mb-3 btn-outline-info">Go to Home</a>
  <form action="/add" method="POST">
    <div class="mb-3">
      <label for="title" class="form-label">Title</label>
      <input type="text" class="form-control" name="title" id="title" placeholder="Title of the book" required>
    </div>
    <div class="mb-3">
      <label for="author" class="form-label">Author</label>
      <input type="text" class="form-control" name="author" id="author" placeholder="Author of the book">
    </div>
    <button type="submit" class="btn mt-3 btn-outline-success">Add Book</button>
  </form>
</div>
{% endblock %}

This HTML file contains a form with two input fields (Title and Author) and a submit button ("Add Book"). The form is submitted to the "/add" URL (action="/add"), and method="POST" specifies that the POST HTTP method is used.

Adding Data Using Frontend

Run your app.py file, or whatever you named your main Flask app file, and navigate to http://localhost:5000 to access the frontend.

You will see a homepage with no records displayed. You must first enter the data at the "/add" URL, which you can also reach by clicking the "Add Books" button on the homepage.

The "/add" page is where you will enter your book information and submit the form. The submitted information will be entered into the database.

After clicking the "Add Book" button, you'll be taken to the homepage, where your entered data will be displayed.

Conclusion

In this tutorial, the PyMySQL driver is used to connect to the MySQL server and create the MySQL database by executing a raw SQL query.

The Flask app then defines the database connection URI string, and SQLAlchemy is initialized with the Flask app.

SQLAlchemy is then used to create a table within the database in an object-oriented way using a Python class. The backend is designed to handle database operations, while the frontend is designed to add data to the MySQL database and display it on the homepage.

That's all for now

Keep Coding

What is Threading and How to Create Threads in Python

Sachin Pal — Wed, 04 Oct 2023 05:30:09 GMT

You may have heard the terms "parallelization" or "concurrency", which refer to scheduling tasks to run in parallel or concurrently (at the same time) to save time and resources. This is a common practice in asynchronous programming, where coroutines are used to execute tasks concurrently.

Threading in Python is used to run multiple tasks at the same time, hence saving time and resources and increasing efficiency.

Although multi-threading can save time and resources by executing multiple tasks at the same time, using it in code can lead to safety and reliability issues.

In this article, you'll learn what threading is in Python and how you can use it to make multiple tasks run concurrently.

What is Threading?

Threading, as previously stated, refers to the concurrent execution of multiple tasks in a single process. This is accomplished by utilizing Python's threading module.

Threads are smaller units of the program that run concurrently and share the same memory space.

How to Create Threads and Execute Concurrently

Python provides a module called threading that provides a high-level threading interface to create and manage threads in Python programs.

Create and Start Thread

A thread can be created using the Thread class provided by the threading module. Using this class, you can create an instance of the Thread and then start it using the .start() method.

import threading

# Creating Target Function
def num_gen(num):
    for n in range(num):
        print("Thread: ", n)

# Main Code of the Program
if __name__ == "__main__":
    print("Statement: Creating and Starting a Thread.")
    thread = threading.Thread(target=num_gen, args=(3,))
    thread.start()
    print("Statement: Thread Execution Finished.")

A thread is created by instantiating the Thread class with a target parameter that takes a callable object (in this case, the num_gen function) and an args parameter that accepts a list or tuple of arguments (in this case, the one-element tuple (3,)).

This means that you are telling Thread to run the num_gen() function and pass 3 as an argument.

If you run the code, you'll get the following output:

Statement: Creating and Starting a Thread.
Statement: Thread Execution Finished.
Thread:  0
Thread:  1
Thread:  2

Notice that the "Statement" lines were printed before the thread had finished. Why does this happen?

The thread starts executing concurrently with the main program and the main program does not wait for the thread to finish before continuing its execution. That's why the above code resulted in executing the print statement before the thread was finished.

To understand this, you need to understand the execution flow of the program:

First, the "Statement: Creating and Starting a Thread." print statement is executed.
Then the thread is created and started using thread.start().
The thread starts executing concurrently with the main program.
The "Statement: Thread Execution Finished." print statement is executed by the main program.
The thread continues and prints the output.

The thread and the main program run independently, which is why their execution order is not fixed.

join() Method - The Saviour

Given the above situation, you might wonder how to suspend the execution of the main program until the thread has finished executing.

This is what the join() method is for: it blocks the calling thread (here, the main program) until the thread on which join() is called terminates.

import threading

# Creating Target Function
def num_gen(num):
    for n in range(num):
        print("Thread: ", n)

# Main Code of the Program
if __name__ == "__main__":
    print("Statement: Creating and Starting a Thread.")
    thread = threading.Thread(target=num_gen, args=(3,))
    thread.start()
    thread.join()
    print("Statement: Thread Execution Finished.")

After creating and starting a thread, the join() method is called on the Thread instance (thread). Now run the code, and you'll get the following output.

Statement: Creating and Starting a Thread.
Thread:  0
Thread:  1
Thread:  2
Statement: Thread Execution Finished.

As can be seen, the "Statement: Thread Execution Finished." print statement is executed after the thread terminates.
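The join() method also accepts an optional timeout in seconds, in which case it returns once the timeout has elapsed even if the thread is still running. The sketch below illustrates this with a hypothetical slow_task() function and checks the thread's state with is_alive(); the sleep durations are arbitrary values chosen for the example.

```python
import threading
import time

def slow_task():
    # Simulate work that takes longer than the timeout used below
    time.sleep(0.5)

thread = threading.Thread(target=slow_task)
thread.start()

# Wait at most 0.1 seconds; join() returns None whether or not the thread finished
thread.join(timeout=0.1)
still_running = thread.is_alive()
print("Still running after timeout:", still_running)

# Wait without a timeout until the thread terminates
thread.join()
finished = not thread.is_alive()
print("Finished:", finished)
```

Since join() always returns None, is_alive() is the way to find out whether a timed join actually saw the thread finish.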

Daemon Threads

Daemon threads run in the background and are terminated abruptly when the main program exits, whether or not they have completed their work.

You can make a daemon thread by passing the daemon parameter when instantiating the Thread class. You can pass a boolean value to indicate whether the thread is a daemon (True) or not (False).

import threading
import time

def daemon_thread():
    while True:
        print("Daemon thread is running.")
        time.sleep(1)
        print("Daemon thread finished executing.")

if __name__ == "__main__":
    thread1 = threading.Thread(target=daemon_thread, daemon=True)
    thread1.start()
    print("Main program exiting.")

A thread is created by instantiating the Thread class with the daemon_thread function as its target, and the daemon parameter is set to True to mark it as a daemon thread.

The daemon_thread() function is an infinite loop that prints a statement, sleeps for one second, and then again prints a statement.

Now when you run the above code, you'll get the following output.

Daemon thread is running.
Main program exiting.

You can see that as soon as the main program exits, the daemon thread terminates.

At the time when the daemon_thread() function enters the loop, the concurrently running main program exits, and the daemon_thread() function never reaches the next print statement as can be seen in the output.
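The daemon flag can also be set through the thread's daemon attribute, but only before the thread is started. The following is a small sketch of that behaviour, using a hypothetical background_task() function; trying to change the flag after start() raises a RuntimeError.

```python
import threading
import time

def background_task():
    # Short placeholder for some background work
    time.sleep(0.1)

worker = threading.Thread(target=background_task)
worker.daemon = True          # must be set before start()
worker.start()
print("Is daemon:", worker.daemon)

try:
    worker.daemon = False     # too late: the thread has already been started
except RuntimeError as error:
    changed_after_start = False
    print("Cannot change daemon status:", error)
else:
    changed_after_start = True

worker.join()
```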

threading.Lock - Avoiding Race Conditions

Threads, as you know, run concurrently in a program. If your program has multiple threads, they may access the same resources or enter the same critical section of the code at the same time. When the result depends on the order in which the threads happen to run, this is called a race condition.

This is where the Lock comes into play. It is a synchronization primitive that prevents multiple threads from accessing a particular piece of code or a resource simultaneously.

The thread calls the acquire() method to acquire the Lock and the release() method to release the Lock.

import threading

# Creating Lock instance
lock = threading.Lock()

data = ""

def read_file():
    global data
    with open("sample.txt", "r") as file:
        for info in file:
            data += "\n" + info

def lock_task():
    lock.acquire()
    read_file()
    lock.release()

if __name__ == "__main__":
    thread1 = threading.Thread(target=lock_task)
    thread2 = threading.Thread(target=lock_task)

    thread1.start()
    thread2.start()

    thread1.join()
    thread2.join()

    # Printing the data read from the file
    print(f"Data: {data}")

First, a Lock is created using threading.Lock() and stored inside the lock variable.

An empty string (data) is created to collect the information read by both threads.

The read_file() function reads the information from the sample.txt file and appends it to data.

The lock_task() function is created and when it is called, the following events occur:

The lock.acquire() method is called first. If the Lock is already held by another thread, the call blocks until the Lock is released.
Once the Lock has been acquired, the thread executes the read_file() function.
After the read_file() function has finished executing, the lock.release() method releases the Lock to make it available again for other threads.

Within the if __name__ == "__main__" block, two threads, thread1 and thread2, are created, and both run the lock_task() function.

Both threads run concurrently and attempt to access and execute the read_file() function at the same time but only one thread can access and enter the read_file() at a time due to the Lock.

The main program waits for both threads to execute completely because of thread1.join() and thread2.join().

Then using the print statement, the information present in the file is printed.

Data:
Hello there! Welcome to GeekPython.
Hello there! Welcome to GeekPython.

As can be seen in the output, one thread at a time reads the file. Because there were two threads, the file was read twice, first by thread1 and then by thread2.
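A Lock can also be used as a context manager, which acquires it on entry and releases it on exit, even if the protected code raises an exception. The sketch below applies this to a shared counter rather than a file; the increment() function and the iteration counts are assumptions made for the illustration.

```python
import threading

lock = threading.Lock()
counter = 0

def increment(times):
    global counter
    for _ in range(times):
        # The with statement acquires the lock and releases it
        # automatically, even if an exception is raised inside
        with lock:
            counter += 1

threads = [threading.Thread(target=increment, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Counter: {counter}")
```

Because every increment happens while the lock is held, no update is lost and the counter ends at the number of threads multiplied by the number of increments per thread.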

Semaphore Objects in Threading

Semaphore allows you to limit the number of threads that you want to access the shared resources simultaneously. Semaphore has two methods:

acquire(): A thread can acquire the semaphore if it is available. When a thread acquires the semaphore, the semaphore's count is decremented if it is greater than zero. If the count is zero, the thread waits until the semaphore becomes available.
release(): After using the resources, the thread releases the semaphore, which increments the count. This means that the shared resources are available again.

Semaphore is used to limit access to shared resources, preventing resource exhaustion and ensuring controlled access to resources with limited capacity.

import threading

# Creating a semaphore
sem = threading.Semaphore(2)

def thread_task(num):
    print(f"Thread {num}: Waiting")

    # Acquire the semaphore
    sem.acquire()
    print(f"Thread {num}: Acquired the semaphore")

    # Simulate some work
    for _ in range(5):
        print(f"Thread {num}: In process")

    # Release the semaphore when done
    sem.release()
    print(f"Thread {num}: Released the semaphore.")

if __name__ == "__main__":
    thread1 = threading.Thread(target=thread_task, args=(1,))
    thread2 = threading.Thread(target=thread_task, args=(2,))
    thread3 = threading.Thread(target=thread_task, args=(3,))

    thread1.start()
    thread2.start()
    thread3.start()

    thread1.join()
    thread2.join()
    thread3.join()

    print("All threads have finished.")

In the above code, Semaphore is instantiated with the integer value of 2 which means two threads are allowed to run at the same time.

Three threads are created and all of them use the thread_task() function. But only two threads are allowed to run at the same time, so two threads will access and enter the thread_task() function at the same time, and when any of the threads releases the semaphore, the third thread will acquire the semaphore.

Thread 1: Waiting
Thread 1: Acquired the semaphore
Thread 1: In process
Thread 1: In process
Thread 1: In process
Thread 1: In process
Thread 1: In process
Thread 2: Waiting
Thread 2: Acquired the semaphore
Thread 1: Released the semaphore.
Thread 2: In process
Thread 2: In process
Thread 3: Waiting
Thread 2: In process
Thread 3: Acquired the semaphore
Thread 3: In process
Thread 2: In process
Thread 2: In process
Thread 2: Released the semaphore.
Thread 3: In process
Thread 3: In process
Thread 3: In process
Thread 3: In process
Thread 3: Released the semaphore.
All threads have finished.
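One way to check that the semaphore really caps concurrency is to count how many threads are inside the protected section at any moment. The following is a rough sketch under that idea; the active and max_active counters and the extra state_lock are helpers introduced only for this illustration, and the semaphore is used as a context manager instead of calling acquire() and release() explicitly.

```python
import threading
import time

sem = threading.Semaphore(2)
state_lock = threading.Lock()   # protects the two counters below
active = 0
max_active = 0

def thread_task():
    global active, max_active
    with sem:                     # acquire on entry, release on exit
        with state_lock:
            active += 1
            max_active = max(max_active, active)
        time.sleep(0.05)          # simulate some work
        with state_lock:
            active -= 1

threads = [threading.Thread(target=thread_task) for _ in range(5)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(f"Maximum threads inside the semaphore at once: {max_active}")
```

With Semaphore(2), the reported maximum should never exceed 2, no matter how many threads are started.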

Using ThreadPoolExecutor to Execute Tasks from a Pool of Worker Threads

The ThreadPoolExecutor is a part of the concurrent.futures module and is used to execute multiple tasks concurrently. Using ThreadPoolExecutor, you can run multiple tasks or functions concurrently without having to manually create and manage threads.

from concurrent.futures import ThreadPoolExecutor

# Creating pool of 4 threads
executor = ThreadPoolExecutor(max_workers=4)

# Function to evaluate square number
def square_num(num):
    print(f"Square of {num}: {num * num}.")

task1 = executor.submit(square_num, 5)
task2 = executor.submit(square_num, 2)
task3 = executor.submit(square_num, 55)
task4 = executor.submit(square_num, 4)

# Wait for tasks to complete and then shutdown
executor.shutdown()

The above code creates a ThreadPoolExecutor with a maximum of 4 worker threads which means the thread pool can have a maximum of 4 worker threads executing the tasks concurrently.

Four tasks are submitted to the ThreadPoolExecutor using the submit() method with the square_num() function and various arguments. Each task executes the function with the specified argument and prints the output.

Finally, the shutdown() method is called so that the ThreadPoolExecutor shuts down after the tasks are completed and its resources are freed.

You don't have to explicitly call the shutdown method if you create ThreadPoolExecutor using the with statement.

from concurrent.futures import ThreadPoolExecutor

# Task
def square_num(num):
    print(f"Square of {num}: {num * num}.")

# Using ThreadPoolExecutor as context manager
with ThreadPoolExecutor(max_workers=4) as executor:
    task1 = executor.submit(square_num, 5)
    task2 = executor.submit(square_num, 2)
    task3 = executor.submit(square_num, 55)
    task4 = executor.submit(square_num, 4)

In the above code, the ThreadPoolExecutor is used with the with statement. When the with block is exited, the ThreadPoolExecutor is automatically shut down and its resources are released.

Both versions of the code produce the same result, although the order of the lines is not guaranteed because the tasks run concurrently.

Square of 5: 25.
Square of 2: 4.
Square of 55: 3025.
Square of 4: 16.
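The submit() method returns a Future object, so a task can hand back a value instead of printing it. The sketch below is a variation on the example above, assuming a square_num() that returns its result; it collects values with Future.result() and with executor.map(), which yields results in the order of the inputs.

```python
from concurrent.futures import ThreadPoolExecutor

def square_num(num):
    # Return the value instead of printing it
    return num * num

numbers = [5, 2, 55, 4]

with ThreadPoolExecutor(max_workers=4) as executor:
    # submit() returns a Future; result() blocks until the value is ready
    future = executor.submit(square_num, 5)
    single = future.result()

    # map() yields results in the order of the inputs
    squares = list(executor.map(square_num, numbers))

print(single)    # 25
print(squares)   # [25, 4, 3025, 16]
```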

Common Functions in Threading

The threading module provides numerous functions and some of them are explained below.

Getting Main and Current Thread

The threading module has a main_thread() and a current_thread() function, which are used to get the main thread and the currently running thread, respectively.

import threading

def task():
    for _ in range(2):
        # Getting the current thread name
        print(f"Current Thread: {threading.current_thread().name} is running.")

# Getting the main thread name
print(f"Main thread   : {threading.main_thread().name} started.")

thread1 = threading.Thread(target=task)
thread2 = threading.Thread(target=task)

thread1.start()
thread2.start()

thread1.join()
thread2.join()

print(f"Main thread   : {threading.main_thread().name} finished.")

Because the main_thread() and current_thread() functions return a Thread object, threading.main_thread().name is used to get the name of the main thread and threading.current_thread().name is used to get the name of the current thread.

Main thread   : MainThread started.
Current Thread: Thread-1 (task) is running.
Current Thread: Thread-1 (task) is running.
Current Thread: Thread-2 (task) is running.
Current Thread: Thread-2 (task) is running.
Main thread   : MainThread finished.
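The default names such as Thread-1 (task) can be replaced by passing a name argument to the Thread class. The following is a minimal sketch; "reader" and "writer" are hypothetical names chosen for illustration, and the threads are run one after the other so that the recorded order is predictable.

```python
import threading

seen_names = []

def task():
    # Record the name of the thread that runs this function
    seen_names.append(threading.current_thread().name)

# "reader" and "writer" are hypothetical names chosen for illustration
thread1 = threading.Thread(target=task, name="reader")
thread2 = threading.Thread(target=task, name="writer")

thread1.start()
thread1.join()
thread2.start()
thread2.join()

print(seen_names)   # ['reader', 'writer']
```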

Monitoring Currently Active Threads

The threading.enumerate() function is used to return the list of Thread objects that are currently running. This includes the main thread even if it is terminated and excludes terminated threads and threads that have not started yet.

If you want to get the number of Thread objects that are currently alive, you can utilize the threading.active_count() function.

import threading

def task():
    print(f"Current Thread     : {threading.current_thread().name} is running.")

# Getting the main thread name
print(f"Main thread        : {threading.main_thread().name} started.")

threads_list = []

for _ in range(5):
    thread = threading.Thread(target=task)
    thread.start()
    threads_list.append(thread)
    # Getting the active thread count
    print(f"\nActive Thread Count: {threading.active_count()}")

for thread in threads_list:
    thread.join()

print(f"Main thread        : {threading.main_thread().name} finished.")

# Getting the active thread count
print(f"Active Thread Count: {threading.active_count()}")

# Getting the list of active threads
for active in threading.enumerate():
    print(f"Active Thread List: {active.name}")

Output

Main thread        : MainThread started.
Current Thread     : Thread-1 (task) is running.
Active Thread Count: 2
Current Thread     : Thread-2 (task) is running.
Active Thread Count: 2
Current Thread     : Thread-3 (task) is running.
Active Thread Count: 2
Current Thread     : Thread-4 (task) is running.
Active Thread Count: 2
Current Thread     : Thread-5 (task) is running.
Active Thread Count: 1
Main thread        : MainThread finished.
Active Thread Count: 1
Active Thread List: MainThread

Getting Thread Id

import threading
import time

def task():
    print(f"Thread {threading.get_ident()} is running.")
    time.sleep(1)
    print(f"Thread {threading.get_ident()} is terminated.")

print("Main thread started.")

threads_list = []

for _ in range(5):
    thread = threading.Thread(target=task)
    thread.start()
    threads_list.append(thread)

for thread in threads_list:
    thread.join()

print("Main thread finished.")

Every thread running in a process is assigned an identifier and the threading.get_ident() function is used to retrieve the identifier of the currently running thread.

Main thread started.
Thread 9824 is running.
Thread 7188 is running.
Thread 4616 is running.
Thread 3264 is running.
Thread 7716 is running.
Thread 7716 is terminated.
Thread 9824 is terminated.
Thread 7188 is terminated.
Thread 4616 is terminated.
Thread 3264 is terminated.
Main thread finished.

Conclusion

A thread is a smaller unit of execution within a program, created in Python using the threading module. Threads let you run tasks or functions concurrently, which can save time and resources.

In this article, you've learned:

What is threading and how do you create and start a thread
Why join() method is used
What are daemon threads and how to create one
How to use a Lock to avoid race conditions
How semaphore is used to limit the number of threads that can access the shared resources at the same time.
How you can execute a group of tasks using the ThreadPoolExecutor without having to create threads.
Some common functions provided by the threading module.

That's all for now

Keep Coding

Practical Examination: Impact of Learning Rate on ML and DL Model's Performance

Sachin Pal — Sun, 01 Oct 2023 05:30:13 GMT

In this tutorial, you'll look at how learning rate affects ML (Linear Regression Model) and DL (Neural Networks) models, as well as which adaptive learning rate methods best optimize neural networks in deep learning.

What is the Learning Rate?

Learning rate is a hyperparameter that controls the step size by which the model's weights are updated during each iteration of the optimization process. It is used in optimization algorithms like SGD (Stochastic Gradient Descent) to minimize the loss function, which in turn improves the model's performance.

The step size, determined by the learning rate, decides how far the model's weights move in each iteration in the direction opposite to the gradient of the loss function. The learning rate can be dynamically adjusted during training to help the model reach the best possible performance.

Impact of Learning Rate on Model's Performance

A higher learning rate causes the model's weights to take larger steps on each iteration. While this can lead to faster convergence, the updates can also overshoot the minimum of the loss function, resulting in instability and poorer performance.

In the case of a lower learning rate, the model's weights are updated by small steps causing slower convergence towards the optimal performance. Although it takes more time to train, it often offers greater stability and a better chance of reaching an optimal performance.
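The two cases above can be illustrated with plain gradient descent on a toy function. The sketch below minimizes f(w) = (w - 3) ** 2, whose minimum is at w = 3, using three learning rates; the function, the starting point, and the step count are assumptions chosen for the example and are unrelated to the dataset used later.

```python
def gradient_descent(learning_rate, steps=50, start=0.0):
    """Minimize f(w) = (w - 3) ** 2 and return the final value of w."""
    w = start
    for _ in range(steps):
        gradient = 2 * (w - 3)              # derivative of (w - 3) ** 2
        w = w - learning_rate * gradient    # step against the gradient
    return w

small = gradient_descent(0.01)   # slow: still far from 3 after 50 steps
good = gradient_descent(0.1)     # ends close to the minimum at w = 3
large = gradient_descent(1.1)    # overshoots more on every step and diverges

print(f"lr=0.01 -> w={small:.4f}")
print(f"lr=0.1  -> w={good:.4f}")
print(f"lr=1.1  -> w={large:.4f}")
```

Each update multiplies the distance from the minimum by (1 - 2 * learning_rate), so a rate of 0.01 shrinks it slowly, 0.1 shrinks it quickly, and 1.1 makes it grow on every step.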

Practical Example

This section will cover the practical examples of using learning rate to understand its impact on the machine and deep learning models. Additionally, you'll see how adaptive learning rate methods help in optimizing the deep learning model's performance.

Impact of Learning Rate on Machine Learning Model

In this section, you'll use the SGDRegressor model provided by Scikit-Learn, which lets you set the learning rate through its parameters.

Grab the dataset from here.

Importing Required Modules

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import SGDRegressor
from sklearn.metrics import mean_squared_error

The pandas library will be used for reading the dataset whereas matplotlib.pyplot will be used for plotting the learning curves.

The train_test_split function from the Scikit-Learn library will be employed to split the dataset into training and testing sets. Additionally, the mean_squared_error function from Scikit-Learn's metrics module will be used to calculate the mean squared error between the actual and predicted values.

The SGDRegressor class will be used for fitting the linear regression model that uses stochastic gradient descent (SGD) for optimization.

Loading the Data

# Load the data
df = pd.read_csv("D:/SACHIN/Jupyter/LRImpact/bmi_data.csv")

# Separating Feature and Target Variables
X = df[['Height', 'Weight', 'Gender']]
y = df['Index']

The data is read from the specified CSV file using the pd.read_csv() function, and the resulting DataFrame is stored inside the df variable.

The features of the dataset are separated and stored inside the X variable. Similarly, the target variable, denoted as "Index", is separated from the dataset and stored inside the y variable.

Preparing Data for Training

# Preparing Data for training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# List of Learning rates
learning_rates = [0.0001, 0.001, 0.01, 0.1, 0.2, 0.5, 0.8, 1]

# Empty List for storing errors
train_errors_list = []  # To store train errors
test_errors_list = []  # To store test errors

The data is split into training and testing sets, specifically into X_train, X_test, y_train, and y_test using the train_test_split() function, where both features (X) and the target (y) are provided as input. The test_size parameter is set to 30% of the dataset, and the random_state is set to 42 for reproducibility.

A list of learning rates is defined, the model will run for each value in the learning_rates list, and two empty lists are created named train_errors_list (store training errors) and test_errors_list (store test errors).

Training Model with SGDRegressor

for i in range(len(learning_rates)):
    model = SGDRegressor(learning_rate='constant', eta0=learning_rates[i], random_state=42)

    train_errors = []
    test_errors = []

    for _ in range(200):
        model.partial_fit(X_train, y_train)

        y_train_pred = model.predict(X_train)
        y_test_pred = model.predict(X_test)

        train_error = mean_squared_error(y_train, y_train_pred)
        test_error = mean_squared_error(y_test, y_test_pred)

        train_errors.append(train_error)
        test_errors.append(test_error)

    train_errors_list.append(train_errors)
    test_errors_list.append(test_errors)

A loop is initiated to iterate over the list of learning rates. This loop will run for the same number of iterations as there are learning rates in the list.

Inside the loop, the SGDRegressor model is instantiated with specific parameters: learning_rate is set to 'constant' so that the step size stays fixed during training (the default for SGDRegressor is 'invscaling'), and eta0 is set to learning_rates[i], so each pass of the loop uses a different learning rate.
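To build some intuition for why a constant learning rate matters, here is a minimal sketch of plain gradient descent on the toy function f(w) = w**2. This is not what SGDRegressor does internally; the function, the starting point and the step count are all assumptions chosen for illustration.

```python
def gradient_descent(learning_rate, steps=50, start=5.0):
    """Minimise f(w) = w**2 with a constant learning rate.

    Returns the distance from the minimum (at w = 0) after every step.
    """
    w = start
    history = []
    for _ in range(steps):
        gradient = 2 * w                  # derivative of w**2
        w = w - learning_rate * gradient  # constant step size
        history.append(abs(w))
    return history

# A tiny rate barely moves, a moderate rate converges, a large rate diverges
for lr in [0.001, 0.1, 1.1]:
    final = gradient_descent(lr)[-1]
    print(f"lr={lr}: distance from minimum after 50 steps = {final:.4g}")
```

With a very small rate the distance shrinks slowly, with a moderate rate it shrinks quickly, and with a rate that is too large every step overshoots the minimum and the distance grows.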

Two empty lists, train_errors and test_errors, are created to collect the training and testing errors, respectively.

Within the loop, another loop is created to run the model for 200 epochs or iterations.

During each epoch, the model is updated using the partial_fit method, which performs one pass of stochastic gradient descent over the supplied training data. Calling it repeatedly on the same model is what makes the learning incremental.

Following the update, the model's predictions for both the training and testing data are stored in y_train_pred and y_test_pred, respectively.

The mean squared error is then computed between the actual target values (y_train and y_test) and the predicted target values (y_train_pred and y_test_pred) to evaluate how well the model fits the data.
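For reference, the quantity that mean_squared_error reports can be written in a few lines. The sketch below uses a hypothetical helper named mse and made-up values, purely to show the arithmetic.

```python
def mse(y_true, y_pred):
    """Mean of the squared differences between actual and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared) / len(squared)

# Made-up values for illustration
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# Squared errors are 0.25, 0.25, 0.0 and 1.0, so the mean is 1.5 / 4
print(mse(y_true, y_pred))
```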

The training and testing errors for each epoch are stored in the train_errors and test_errors lists, respectively.

Upon completing 200 epochs for each learning rate, the accumulated training and testing errors are appended to the train_errors_list and test_errors_list, respectively.

Plotting Learning Curves

# Plot learning curves for each learning rate
for i in range(len(learning_rates)):
    plt.subplot(4, 2, (i + 1))
    plt.plot(train_errors_list[i], label='train')
    plt.plot(test_errors_list[i], label='test')
    plt.legend()
    plt.suptitle('Impact of Learning Rates on Linear Regression')
    plt.title('LR = ' + str(learning_rates[i]))
plt.show()

To plot the learning curves for each learning rate, a loop is created to iterate over the length of learning rates from the list learning_rates.

Within the loop, a subplot is selected in a grid of 4 rows and 2 columns using plt.subplot(4, 2, (i + 1)), where i + 1 specifies the index of the subplot.

Two lines are plotted within each subplot: train_errors_list[i] (training errors for a specific learning rate) and test_errors_list[i] (testing errors for a specific learning rate).

Each subplot's legend and title are customized, and the supertitle is customized for the entire grid of subplots.

Finally, the plt.show() function is used to display the entire grid of subplots.

Result

The model with learning rates of 0.001 and 0.01 performed well, but the model with a learning rate of 0.1 performed significantly better. Despite the slower convergence, the stability is excellent.

You can experiment with various values, such as changing the learning rates and increasing or decreasing the number of epochs.

Note: Your results may vary. You can try running the model 2-3 times to get an average outcome.

Impact of Learning Rate on Deep Learning Model

In this section, you will examine the effect of learning rates on neural networks using Keras' SGD optimizer, as well as the effect of momentum on performance by accelerating the training process. Then you'll see which optimizer optimizes the neural networks the best.

Importing Required Modules

import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical

The Sequential class will be used to stack neural network layers, the Dense layer will be used to add fully connected layers to the neural network architecture, the SGD optimizer will be essential for applying learning rate and momentum to the neural network, and to_categorical will be used for one-hot encoding.

Loading the Data

# Reading data
df = pd.read_csv("D:/SACHIN/Jupyter/LRImpact/bmi_data.csv")
X = df[['Height', 'Weight', 'Gender']]
y = df['Index']

The data is read and the resulting DataFrame is stored inside the df variable.

The features of the dataset are separated and stored inside the X variable. Similarly, the target variable, denoted as "Index", is separated from the dataset and stored inside the y variable.

Preparing the Data

# One-hot Encoding the Target Variable
y = to_categorical(y)
# Preparing Data for Training
param = 300
X_train, X_test = X[:param], X[param:]
y_train, y_test = y[:param], y[param:]

First, the target variable y undergoes one-hot encoding using the to_categorical() function, which converts it into a 2D array of binary vectors suitable for multi-class classification problems.
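To see what one-hot encoding produces, here is a small NumPy sketch of the same idea. The one_hot helper and the label values are hypothetical, and the sketch assumes integer class labels starting at 0; it is an illustration, not the Keras implementation.

```python
import numpy as np

def one_hot(labels, num_classes=None):
    """Turn integer class labels into rows with a single 1 per row."""
    labels = np.asarray(labels, dtype=int)
    if num_classes is None:
        num_classes = labels.max() + 1
    # Row k of the identity matrix is the one-hot vector for class k
    return np.eye(num_classes)[labels]

# Made-up labels covering classes 0 to 5
y_demo = [0, 2, 5, 2]
encoded = one_hot(y_demo)

print(encoded.shape)
print(encoded[1])
```

Each label becomes a row of length 6 with a 1 in the position of its class, which is the shape expected by a softmax output layer with 6 nodes.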

The dataset is divided into two batches: training and testing. The split point is determined by the param variable.

The X_train variable contains samples from start (index=0) to 299 (index=299) and the X_test variable holds samples from 300 up to the end. The same is done with the y_train and y_test variables.

# Empty list for storing the training data accuracy
acc = []
# Empty list for storing the validation data accuracy
val_acc = []
# List of learning rates
learning_rates = [1, 0.1, 0.01, 0.001, 0.0001, 0.00001, 0.000001, 0.0000001]

Two empty lists are created: acc will collect accuracy values from the training data and val_acc will store accuracy values from the validation data.

The learning_rates variable defines a list of various learning rates.

Training the Neural Networks

for i in range(len(learning_rates)):
    model = Sequential()
    model.add(Dense(128, input_shape=(3, ), activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(6, activation='softmax'))
    optimize = SGD(learning_rate=learning_rates[i])
    model.compile(optimizer=optimize, loss='categorical_crossentropy', metrics=['accuracy'])
    result = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=200, verbose=0)
    acc.append(result.history['accuracy'])
    val_acc.append(result.history['val_accuracy'])

The loop iterates over the learning rates, and for each learning rate, a neural network model is created and trained.

Inside the loop, an instance of the Sequential class is created and stored inside the model variable.

Following that, two hidden layers with 128 and 64 nodes, respectively, are added, both using the ReLU activation function. The first of them also declares the input shape of 3 features through input_shape=(3, ). The output layer has 6 nodes (output classes) and uses the softmax activation function.
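As a quick sanity check on the size of this network, the number of trainable parameters can be worked out by hand: a fully connected layer has one weight per input-output pair plus one bias per unit. The dense_params helper below is a hypothetical name used only for this sketch.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# 3 input features -> 128 units -> 64 units -> 6 output classes
layer_sizes = [3, 128, 64, 6]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total = sum(per_layer)

print(per_layer, total)
```

The three layers hold 512, 8256 and 390 parameters, 9158 in total, which should match the count reported by model.summary() for this architecture.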

The instance of SGD optimizer is created in which the learning_rate parameter is set to learning_rates[i] which means the learning rate changes for each iteration, allowing you to test different learning rates.

The model is compiled with the specified optimizer (optimize), loss function (categorical_crossentropy for multi-class classification), and the metric to be monitored during training (accuracy).

The model is trained using X_train and y_train for 200 epochs (epochs=200) and validated on X_test and y_test. The training results are stored in the result variable.

The accuracy values for both training and validation data are collected for each learning rate. Training accuracy is stored in the acc list, and validation accuracy is stored in the val_acc list using result.history['accuracy'] and result.history['val_accuracy'].

Plotting

# Plot learning curves for each learning rate
for i in range(len(learning_rates)):
    plt.subplot(4, 2, (i + 1))
    plt.plot(acc[i], label='train')
    plt.plot(val_acc[i], label='test')
    plt.legend()
    plt.title('LR = ' + str(learning_rates[i]))
    plt.suptitle("Impact of LR on Multi-class Classification Model")
plt.show()

For each learning rate, a subplot is selected in a grid of 4 rows and 2 columns, and two lines are plotted in it: acc[i] (training data accuracy for a specific learning rate) and val_acc[i] (validation data accuracy for a specific learning rate).

Finally, the entire grid of subplots is displayed using the plt.show() function.

Result

You can see that the model did well for the learning rates of 0.001, 0.0001, and 0.00001. The convergence is slow and takes more training time but the model has shown great stability and performance.

Note: Your results may vary. You can try running the model 2-3 times to get an average outcome.

Using Momentum

The SGD optimizer offers a momentum parameter that, when utilized, can help accelerate the training process by allowing the optimizer to build up velocity and smoothen the progression during optimization.
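The idea of building up velocity can be sketched without Keras. The snippet below applies the classical momentum update (velocity = momentum * velocity - learning_rate * gradient, then w = w + velocity) to the toy function f(w) = w**2. The function, the starting point and the helper name are assumptions made for illustration only.

```python
def sgd_with_momentum(momentum, learning_rate=0.001, steps=200, start=5.0):
    """Minimise f(w) = w**2 and return the final distance from the minimum at 0."""
    w = start
    velocity = 0.0
    for _ in range(steps):
        gradient = 2 * w  # derivative of w**2
        # Keep a fraction of the previous step and add the new gradient step
        velocity = momentum * velocity - learning_rate * gradient
        w = w + velocity
    return abs(w)

for m in [0.0, 0.1, 0.5, 0.9]:
    print(f"momentum={m}: distance after 200 steps = {sgd_with_momentum(m):.4f}")
```

With the same small learning rate, higher momentum values end up closer to the minimum after the same number of steps, which is the acceleration effect examined in this section.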

import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
# Reading data
df = pd.read_csv("D:/SACHIN/Jupyter/LRImpact/bmi_data.csv")
X = df[['Height', 'Weight', 'Gender']]
y = df['Index']
# One-hot Encoding the Target Variable
y = to_categorical(y)
# Preparing Data for Training
param = 300
X_train, X_test = X[:param], X[param:]
y_train, y_test = y[:param], y[param:]
# Empty list for storing the training data accuracy
acc = []
# Empty list for storing the validation data accuracy
val_acc = []
# List of Momentum
mom = [0.0, 0.1, 0.5, 0.9]
for i in range(len(mom)):
    model = Sequential()
    model.add(Dense(128, input_shape=(3,), activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(6, activation='softmax'))
    optimize = SGD(learning_rate=0.001, momentum=mom[i])
    model.compile(optimizer=optimize, loss='categorical_crossentropy', metrics=['accuracy'])
    result = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=200, verbose=0)
    acc.append(result.history['accuracy'])
    val_acc.append(result.history['val_accuracy'])
# Plot learning curves for each momentum value
for i in range(len(mom)):
    plt.subplot(2, 2, (i + 1))
    plt.plot(acc[i], label='train')
    plt.plot(val_acc[i], label='test')
    plt.legend()
    plt.title('Momentum = ' + str(mom[i]))
    plt.suptitle("Impact of Momentum")
plt.show()

In the above code, a learning rate of 0.001 is taken. This learning rate has shown reasonable performance in previous experiments when training neural networks.

Different values of momentum are stored in the mom list. The code iteratively applies these momentum values to the model.

The line optimize = SGD(learning_rate=0.001, momentum=mom[i]) initializes an instance of the SGD optimizer. It configures the optimizer with a fixed learning rate of 0.001 and a momentum value specified by mom[i]. This means that for each iteration, the momentum value changes according to the elements of the mom list.

The code creates a grid of subplots with 2 rows and 2 columns to visualize the learning curves for different momentum values.

Finally, learning curves are plotted for each momentum value, showing both training and validation accuracy over epochs.

Result

You can see that momentum accelerated the training process of the model. A momentum value of 0.9 reached approximately its maximum train and test accuracy within 70 or fewer epochs, and after that the learning curve stayed almost constant.

Various Optimizers for Model Optimization

In this section, you will see how various optimizers impact the training process of the model with their default setting.

Keras provides a variety of optimizers that build on the stochastic gradient descent algorithm with adaptive learning rates. The following optimizers will be used for optimizing the model's training process: Adamax, Adam, Adagrad, and RMSprop.

Code

import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical
# Reading data
df = pd.read_csv("D:/SACHIN/Jupyter/LRImpact/bmi_data.csv")
X = df[['Height', 'Weight', 'Gender']]
y = df['Index']
# One-hot Encoding the Target Variable
y = to_categorical(y)
# Preparing Data for Training
param = 300
X_train, X_test = X[:param], X[param:]
y_train, y_test = y[:param], y[param:]
# Empty list for storing the training data accuracy
acc = []
# Empty list for storing the validation data accuracy
val_acc = []
# List of Optimizers
optimizers = ['adamax', 'adam', 'adagrad', 'rmsprop']
for i in range(len(optimizers)):
    model = Sequential()
    model.add(Dense(128, input_shape=(3,), activation='relu'))
    model.add(Dense(64, activation='relu'))
    model.add(Dense(6, activation='softmax'))
    optimize = optimizers[i]
    model.compile(optimizer=optimize, loss='categorical_crossentropy', metrics=['accuracy'])
    result = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=200, verbose=0)
    acc.append(result.history['accuracy'])
    val_acc.append(result.history['val_accuracy'])
# Plot learning curves for different optimizers
for i in range(len(optimizers)):
    plt.subplot(2, 2, (i + 1))
    plt.plot(acc[i], label='train')
    plt.plot(val_acc[i], label='test')
    plt.legend()
    plt.title('Optimizer = ' + str(optimizers[i]))
    plt.suptitle("Impact of Optimizer")
plt.show()

Result

'adamax' and 'adam' exhibit nearly identical performance when converging towards the optimal state. 'adagrad' learned the problem within 60 epochs and then did not improve further. In the case of 'rmsprop', you can see excessive fluctuations in testing accuracy compared to training accuracy while converging; it learned the problem but requires more epochs.

Conclusion

In this tutorial, you've learned how different learning rates affect machine learning and neural network models. You first explored this with Scikit-learn's SGDRegressor model and then with Keras's SGD optimizer for a multi-class classification problem. Additionally, you saw how momentum can speed up training.

In the final part, you tested various optimizers such as Adamax, Adam, Adagrad, and RMSprop with neural networks to find out which one works best.

That's all for now

Keep Coding

How to Find and Delete Duplicate Rows from Dataset Using pandas

Sachin Pal — Thu, 28 Sep 2023 05:30:10 GMT

Data preprocessing is an essential part of machine learning, both for data analysis and for building a robust machine learning model. Well-processed, clean data can make a difference.

When working with multiple datasets and attempting to merge them, it's common to encounter issues such as missing values, data type mismatches, duplicate data, and more.

In this article, you will see how duplicate data can be easily removed using the pandas library.

Sample Dataset

In this tutorial, you will learn how to effectively remove duplicate rows from a car dataset. You'll explore the step-by-step process of identifying and eliminating duplicate entries, ensuring that your data remains clean and accurate for analysis.

import pandas as pd
df = pd.read_csv('car.csv')
df.head(10)
df.shape

Here are the first 10 entries from the car dataset.

print(f"The Shape of the Dataset: {df.shape}")

Output

The Shape of the Dataset: (4340, 8)

You can see that the dataset contains 4340 rows and 8 columns.

Removing Duplicate Entries from the Dataset

In this section, you'll learn how to use the pandas library to find and remove duplicate entries from a dataset.

Finding Duplicate Rows

The first step is to identify or find the rows that share the same data in the dataset.

The DataFrame.duplicated() function can be used to identify the duplicate rows in a dataset. The DataFrame.duplicated() function returns a boolean Series where each row is marked as True if it is a duplicate and False if it is not a duplicate.

In simple words, the DataFrame.duplicated() function checks each row in a DataFrame and compares it to the previous rows. When it finds a row with the same values as a previous row, it marks that row as True in the resulting boolean Series.

# Identifying the duplicated rows
dup = df.duplicated()
dup.head(20)

The above code will identify the duplicated rows in the specified dataset. It then shows the first 20 entries of the resulting boolean Series.

0     False
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
11    False
12    False
13     True
14     True
15     True
16     True
17     True
18     True
19     True
dtype: bool

Syntax

DataFrame.duplicated(subset=None, keep='first')

Parameters:

subset: To identify duplicate values on certain columns. This considers all the columns by default.

keep: This parameter determines which duplicates to mark

first: Mark duplicates as True except for the first occurrence.
last: Mark duplicates as True except for the last occurrence.
False: Mark all duplicates as True including the first and last occurrence.
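The effect of the keep and subset parameters is easier to see on a tiny example. The cars DataFrame below is made up for illustration and is not the article's car.csv.

```python
import pandas as pd

# Hypothetical data: rows 0 and 2 are identical, row 3 differs only in year
cars = pd.DataFrame({
    "name": ["Alto", "Swift", "Alto", "Alto", "City"],
    "year": [2012, 2015, 2012, 2014, 2018],
})

first = cars.duplicated(keep="first")      # marks later copies only
last = cars.duplicated(keep="last")        # marks earlier copies only
all_dups = cars.duplicated(keep=False)     # marks every copy
by_name = cars.duplicated(subset=["name"])  # compares the name column only

print(first.tolist())
print(last.tolist())
print(all_dups.tolist())
print(by_name.tolist())
```

Because the result is a boolean Series, calling .sum() on it is a compact way to count the marked rows.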

Now let's determine the number of duplicate rows in the car dataset.

# Determining the number of duplicate rows
dup = df.duplicated(keep='first')
duplicate_rows = df[dup]
duplicate_count = len(duplicate_rows)
print(f"Number of duplicate rows: {duplicate_count}")

The above code marks the duplicate rows as True except for the first occurrence, and then stores all the duplicate rows inside the duplicate_rows variable by filtering the DataFrame with df[dup].

The length of the duplicate rows is then calculated and printed using len(duplicate_rows).

Number of duplicate rows: 763

You can see that 763 rows are marked as duplicates. The number of duplicate rows will differ from the above result if you use keep=False.

To see which rows are duplicates, you can use the variable duplicate_rows to print them.

duplicate_rows.head(10)

Deleting the Duplicate Rows

Now the second step is to delete the rows identified as duplicates from the dataset.

To carry out this operation, you can use the drop_duplicates() function provided by the pandas library. This function removes the duplicate rows and returns the DataFrame.

Syntax

DataFrame.drop_duplicates(subset=None, keep='first', inplace=False, ignore_index=False)

Parameters:

subset: To identify duplicate values on certain columns. This considers all the columns by default.

keep: This parameter determines which duplicates to keep

first: Delete duplicates except for the first occurrence.
last: Delete duplicates except for the last occurrence.
False: Delete all duplicates including both first and last occurrence.

inplace: Whether to modify the dataset. If set to True, the DataFrame will be modified in place, and no new DataFrame is returned.

ignore_index: If set to True, it will reset the index of the resulting DataFrame.
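Here is a short sketch of how these parameters change the result, again on a made-up DataFrame rather than the article's dataset.

```python
import pandas as pd

# Hypothetical data: rows 0 and 2 are identical, row 3 differs only in year
cars = pd.DataFrame({
    "name": ["Alto", "Swift", "Alto", "Alto", "City"],
    "year": [2012, 2015, 2012, 2014, 2018],
})

kept_first = cars.drop_duplicates()           # keeps row 0, drops row 2
kept_none = cars.drop_duplicates(keep=False)  # drops rows 0 and 2
by_name = cars.drop_duplicates(subset=["name"], ignore_index=True)

print(kept_first.index.tolist())
print(kept_none.index.tolist())
print(by_name)
```

Note that the original index labels are preserved unless ignore_index=True is passed, in which case the result is relabelled 0, 1, 2 and so on.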

Here's how you can delete the duplicate rows after identifying them from the dataset.

data = df.drop_duplicates()
print(f"The Shape of the Dataset: {data.shape}")

The above code will delete all the duplicate rows keeping the first occurrence of each duplicated row. The shape of the DataFrame is printed after duplicate rows have been removed.

The Shape of the Dataset: (3577, 8)

You can notice that the shape of the DataFrame has changed. After removing the duplicated data the DataFrame now has 3577 rows.

If you don't want to keep any duplicate entries, including the first and last occurrence, you can set the keep parameter to False in the drop_duplicates() function.

import pandas as pd
df = pd.read_csv("car.csv")
print(df.shape)
dup = df.duplicated(keep=False)
duplicate_rows = df[dup]
duplicate_row_count = len(duplicate_rows)
print(f"Number of Duplicate Rows: {duplicate_row_count}")
data = df.drop_duplicates(keep=False)
print(f"Shape of Dataset: {data.shape}")

The above code will mark all duplicates as True and then delete them.

(4340, 8)
Number of Duplicate Rows: 1289
Shape of Dataset: (3051, 8)

When the keep parameter is set to False, the number of duplicate rows changes from the previous count. The DataFrame now has 3051 rows after all duplicates have been removed.

Conclusion

Data cleaning is essential for data analysis and data modeling. While performing data preprocessing, you might encounter duplicate data and this data is redundant. Duplicate data can produce biased results, skew statistical analyses, and lead to incorrect conclusions.

Duplicate data can be removed from the DataFrame using the drop_duplicates() function provided by the pandas library.

In this article, you've seen the step-by-step guide to identifying duplicate data from the DataFrame and later removing them.

That's all for now

Keep Coding

How to Use StandardScaler() to Standardize the Data

Sachin Pal — Mon, 25 Sep 2023 05:30:09 GMT

Ensuring consistency in the numerical input data is crucial to enhancing the performance of machine learning algorithms. To achieve this uniformity, it is necessary to adjust the data to a standardized range.

Standardization and Normalization are both widely used techniques for adjusting data before feeding it into machine learning models.

In this article, you will learn how to utilize the StandardScaler class to scale the input data.

What is Standardization?

Before diving into the fundamentals of the StandardScaler class, you need to understand the standardization of the data.

Standardization is a data preparation method that involves adjusting the input (features) by first centering them (subtracting the mean from each data point) and then dividing them by the standard deviation, resulting in the data having a mean of 0 and a standard deviation of 1.

The formula for standardization can be written like the following:

standardized_val = ( input_value - mean ) / standard_deviation

Assume you have a mean value of 10.4 and a standard deviation value of 4. To standardize the value of 15.9, put the given values into the equation as follows:

standardized_val = ( 15.9 - 10.4 ) / 4
standardized_val = ( 5.5 ) / 4
standardized_val = 1.375
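The same calculation can be checked in a couple of lines of Python. The standardize function is a hypothetical helper written for this example.

```python
def standardize(value, mean, std):
    """Centre the value on the mean, then scale by the standard deviation."""
    return (value - mean) / std

# Values from the worked example: mean 10.4, standard deviation 4
z = standardize(15.9, mean=10.4, std=4)
print(round(z, 3))
```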

The StandardScaler stands out as a widely used tool for implementing data standardization.

What is StandardScaler?

The StandardScaler class provided by scikit-learn applies standardization to the input features, making sure each of them has a mean of approximately 0 and a standard deviation of approximately 1.

It adjusts the data to have a standardized distribution, making it suitable for modeling and ensuring that no single feature disproportionately influences the algorithm due to differences in scale.

Why Bother Using it?

Well, so far you've already understood the idea of using StandardScaler in machine learning but just to highlight, here are the primary reasons why you should use StandardScaler:

It can improve the performance of machine learning models.
It keeps the scale of the data points consistent across features.
It is useful when working with machine learning algorithms that can be negatively influenced by differences in the scale of the features of the data.

How to Use StandardScaler?

First, import the StandardScaler class from the sklearn.preprocessing module. After that, create an instance of the class with StandardScaler(). Then call the instance's fit_transform method on the input data.

# Imported required libs
import numpy as np
from sklearn.preprocessing import StandardScaler
# Creating a 2D array
arr = np.asarray([[12, 0.007],
                  [45, 1.5],
                  [75, 2.005],
                  [7, 0.8],
                  [15, 0.045]])
print("Original Array: \n", arr)
# Instance of StandardScaler class
scaler = StandardScaler()
# Fitting and then transforming the input data
arr_scaled = scaler.fit_transform(arr)
print("Scaled Array: \n", arr_scaled)

An instance of the StandardScaler class is created and stored in the variable scaler. This instance will be used to standardize the data.

The fit_transform method of the StandardScaler object (scaler) is called with the original data arr as the input.

The fit_transform method computes the mean and standard deviation of each feature (column) in the input data arr and then uses them to standardize the input data.

Here's the original array and the standardized version of the original array.

Original Array: 
 [[1.200e+01 7.000e-03]
 [4.500e+01 1.500e+00]
 [7.500e+01 2.005e+00]
 [7.000e+00 8.000e-01]
 [1.500e+01 4.500e-02]]
Scaled Array: 
 [[-0.72905466 -1.09507083]
 [ 0.55066894  0.79634605]
 [ 1.71405403  1.43610862]
 [-0.92295217 -0.09045356]
 [-0.61271615 -1.04693028]]
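The scaled values above can be reproduced by hand with NumPy, which makes the per-column nature of the calculation explicit. This is a sketch of the arithmetic only; it assumes the population standard deviation (ddof=0), which is what reproduces the numbers shown.

```python
import numpy as np

arr = np.asarray([[12, 0.007],
                  [45, 1.5],
                  [75, 2.005],
                  [7, 0.8],
                  [15, 0.045]])

# Statistics are computed per column (axis=0), one pair per feature
col_mean = arr.mean(axis=0)
col_std = arr.std(axis=0)  # population standard deviation (ddof=0)

manual = (arr - col_mean) / col_std
print(manual)
```

After the transformation, each column has a mean of 0 and a standard deviation of 1, even though the two original columns were on very different scales.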

Does Standardization Affect the Accuracy of the Model?

In this section, you'll see how the model's performance is affected after applying standardization to features of the dataset.

Let's see how the model will perform on the raw dataset without standardizing the feature variables.

# Evaluate KNN on the breast cancer dataset
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from numpy import mean
# load dataset
df = datasets.load_breast_cancer()
X = df.data
y = df.target
# Instantiating the model
model = KNeighborsClassifier()
# Evaluating the model
scores = cross_val_score(model, X, y, scoring='accuracy', cv=10, n_jobs=-1)
# Model's average score
print(f'Accuracy: {mean(scores):.2f}')

The breast cancer dataset is loaded from the sklearn.datasets and then the features (df.data) and target (df.target) are stored inside the X and y variables.

The K-nearest neighbors classifier (KNN) model is instantiated using the KNeighborsClassifier class and stored inside the model variable.

The cross_val_score function is used to evaluate the KNN model's performance. It passes the model (KNeighborsClassifier()), features (X), target (y), and specifies that accuracy (scoring='accuracy') should be used as the evaluation metric.

This evaluates accuracy by splitting the dataset into 10 roughly equal parts (cv=10), which means the model will be trained and tested 10 times, each time holding out a different part for testing. Here, n_jobs=-1 means using all the available CPU cores for faster cross-validation.

Finally, the average of the accuracy scores (mean(scores)) is printed.

Accuracy: 0.93

Without standardizing the dataset's feature variables, the average accuracy score is 93%.

Using StandardScaler for Applying Standardization

# Evaluate KNN on the breast cancer dataset
from sklearn import datasets
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler
from numpy import mean
# loading dataset and configuring features and target variables
df = datasets.load_breast_cancer()
X = df.data
y = df.target
# Standardizing features
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
# Instantiating model
model = KNeighborsClassifier()
# Evaluating the model
scores = cross_val_score(model, X_scaled, y, scoring='accuracy', cv=10, n_jobs=-1)
# Model's average score
print(f'Accuracy: {mean(scores):.2f}')

The dataset's features undergo scaling with the StandardScaler(), and the resulting scaled dataset is stored in the X_scaled variable.

Next, this scaled dataset is used as input for the cross_val_score function to compute and subsequently display the accuracy.

Accuracy: 0.97

It is noticeable that the accuracy score has significantly increased to 97% when compared to the previous accuracy score of 93%.

The application of StandardScaler(), which standardized the data's features, has notably improved the model's performance.
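One caveat with the approach above is that the scaler is fitted on the whole dataset before cross-validation, so the mean and standard deviation of each test fold influence the training data. A common alternative, sketched below, is to wrap the scaler and the model in a scikit-learn pipeline so that the scaler is refitted on the training portion of every fold. This is a variation on the article's code, not the code that produced the numbers above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)

# The scaler is fitted inside each fold, on that fold's training data only
pipeline = make_pipeline(StandardScaler(), KNeighborsClassifier())

scores = cross_val_score(pipeline, X, y, scoring="accuracy", cv=10)
print(f"Accuracy: {scores.mean():.2f}")
```

On a dataset like this the difference in accuracy is usually small, but the pipeline gives a more honest estimate of how the model would behave on unseen data.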

Conclusion

StandardScaler is used to standardize the input data in a way that ensures that the data points have a balanced scale, which is crucial for machine learning algorithms, especially those that are sensitive to differences in feature scales.

Standardization transforms the data such that the mean of each feature becomes zero (centered at zero), and the standard deviation becomes one.

Let's recall what you've learned:

What actually is StandardScaler
What is standardization and how it is applied to the data points
Impact of StandardScaler on the model's performance

That's all for now

Keep Coding

Comparing the Accuracy of 4 Commonly Used Models in Transfer Learning

Sachin Pal — Sat, 23 Sep 2023 05:30:09 GMT

There are deep learning models that are pre-trained on millions of images. These models reduce the effort of training a custom deep learning model from scratch: you only need to fine-tune them, and they are ready to be trained on your dataset.

Keras provides a high-level API for using pre-trained models. You can easily load these models with their pre-trained weights and adapt them to your specific tasks by adding custom classification layers on top of the pre-trained layers. This allows you to perform transfer learning efficiently.

In this article, you'll see which of four commonly used pre-trained models (VGG, Inception, Xception, and ResNet) is the most accurate with its default settings. You'll train these models on an image dataset, and at the end you will be able to conclude which model performed the best.

Model Name	Top-1 Accuracy	Top-5 Accuracy	Parameters
VGG16	71.3%	90.1%	138.4M
Xception	79.0%	94.5%	22.9M
ResNet50	74.9%	92.1%	25.6M
InceptionV3	77.9%	93.7%	23.9M

You can see that Xception has the highest top-1 and top-5 accuracy among these models, and it does so with 22.9 million parameters.

Evaluating the Accuracy

In this section, you'll train the aforementioned models on the cats and dogs image dataset and evaluate the accuracy.

Importing Required Libs and Modules

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten, GlobalAveragePooling2D, Dropout
from tensorflow.keras.applications import Xception, InceptionV3, VGG16, ResNet50
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import accuracy_score

All the modules and classes that will be required throughout this article are imported.

Sequential will be used to lay the architecture of the model.
Dense will be used to add fully connected layers, Flatten will be used to flatten multi-dimensional feature maps into a 1D vector, GlobalAveragePooling2D will be used to reduce each feature map to a single value by averaging over its spatial dimensions, and Dropout will be used to reduce overfitting by randomly deactivating neurons during training.
Pre-trained models (Xception, InceptionV3, VGG16, and ResNet50) are imported from the tensorflow.keras.applications module.
ImageDataGenerator will be used for preprocessing the images.
accuracy_score will be used to determine the accuracy of the models on the test dataset.

Train, Test & Valid Directory Paths

# Specifying train, valid and test directory paths
train_path = "D:/SACHIN/Jupyter/Transfer Learning Model Comparison/cats_and_dogs/train"
valid_path = "D:/SACHIN/Jupyter/Transfer Learning Model Comparison/cats_and_dogs/valid"
test_path = "D:/SACHIN/Jupyter/Transfer Learning Model Comparison/cats_and_dogs/test"

The dataset is divided into three directories: train, valid, and test, with two subdirectories within each: cat and dog.

The full path to each of these directories is specified: train_path holds training data, valid_path holds validation data, and test_path holds testing data.

Preprocessing Image Data

# Preprocessing images
train_batches = ImageDataGenerator(rescale=1.0 / 255.0)
valid_batches = ImageDataGenerator(rescale=1.0 / 255.0)
test_batches = ImageDataGenerator(rescale=1.0 / 255.0)

The ImageDataGenerator is used to preprocess images for training, validation, and testing datasets.

The images will be rescaled so that their pixel values fall between 0 and 1. More specifically, rescale=1.0/255.0 means that each pixel value of an input image is divided by 255.
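To make the effect of rescale concrete, here is a minimal, dependency-free sketch that applies the same scaling to a handful of pixel values. The rescale_pixels helper is hypothetical and only mimics what the generator does to every image.

```python
def rescale_pixels(pixels, factor=1.0 / 255.0):
    """Multiply every pixel value by the rescale factor (hypothetical helper)."""
    return [value * factor for value in pixels]


raw_pixels = [0, 51, 255]  # raw 8-bit intensities
scaled = rescale_pixels(raw_pixels)
print(scaled)  # values now lie between 0 and 1
```

Keeping the inputs in a small, fixed range generally makes training more stable than feeding raw 0 to 255 values.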

Setting Up Data Generators

# Train data generator
train_gen = train_batches.flow_from_directory(
    directory=train_path,
    target_size=(224, 224),
    classes=['cat', 'dog'],
    batch_size=10
)

# Valid data generator
valid_gen = valid_batches.flow_from_directory(
    directory=valid_path,
    target_size=(224, 224),
    classes=['cat', 'dog'],
    batch_size=10
)

# Test data generator
test_gen = test_batches.flow_from_directory(
    directory=test_path,
    target_size=(224, 224),
    classes=['cat', 'dog'],
    batch_size=10,
    shuffle=False
)

This step involves creating data generators for training, validation, and testing data.

The first call uses the flow_from_directory() method to generate training data with the following parameters:

directory=train_path: Training directory path from where the images will be loaded.
target_size=(224, 224): This resizes the images to 224x224 pixels.
classes=['cat', 'dog']: Defines the classes for classification.
batch_size=10: The generator will produce batches of 10 images at a time.

Similarly, this process will be repeated for validation and testing data. However, because the shuffle parameter is set to False, testing data will not be shuffled.

Found 1000 images belonging to 2 classes.
Found 200 images belonging to 2 classes.
Found 100 images belonging to 2 classes.

As you can see, 1000 images belonging to 2 classes, "cat" and "dog" (500 images per class), were found in the train directory. The valid and test directories contain 200 and 100 images respectively.
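With a batch size of 10, the number of batches each generator yields per pass follows directly from these image counts. A quick sketch of that arithmetic (the batches_per_pass helper is hypothetical):

```python
import math


def batches_per_pass(num_images, batch_size=10):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)


counts = {"train": 1000, "valid": 200, "test": 100}
steps = {name: batches_per_pass(n) for name, n in counts.items()}
print(steps)  # {'train': 100, 'valid': 20, 'test': 10}
```

The 10 test batches are what the 10/10 in the prediction progress bar of the results refers to.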

Initializing Models

# Initializing models
# include_top=False: exclude the fully connected layers at the top of each network
# (for VGG16, these are its 3 fully connected layers)
model_vgg = VGG16(include_top=False, input_shape=(224, 224, 3))
model_xception = Xception(include_top=False, input_shape=(224, 224, 3))
model_inception = InceptionV3(include_top=False, input_shape=(224, 224, 3))
model_resnet = ResNet50(include_top=False, input_shape=(224, 224, 3))

The pre-trained models (VGG16(), Xception(), InceptionV3(), and ResNet50()) are initialized with the following parameters:

include_top=False: This means that the fully connected layers will not be included at the top of the neural networks.
input_shape=(224, 224, 3): The pre-trained models are initialized with the input shape of the images set to 224x224 with the RGB (3) color channel.

Freezing the Layers

# Freezing the layers so that they cannot be trained again
names = [model_vgg, model_xception, model_inception, model_resnet]
for model in names:
    # Iterating all the layers in the pre-trained model
    for layer in model.layers:
        # Making trainable layers set to False
        layer.trainable = False

The above code loops through the list of pre-trained models stored in the names variable, and then the inner loop iterates through the layers of each pre-trained model, freezing them by setting each layer's trainable attribute to False (layer.trainable = False).

This step is included because the layers have already been trained and it is pointless to train them further during fine-tuning. That is the entire purpose of employing the transfer learning technique.

Fine-tuning the Pre-trained Models

# Fine-tuning the pre-trained models
output_classes = len(train_gen.class_indices)

# Custom VGG16 model
custom_vgg_model = Sequential([
    model_vgg,
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(output_classes, activation='softmax')
])

# Custom Xception model
custom_xc_model = Sequential([
    model_xception,
    GlobalAveragePooling2D(),
    Dense(output_classes, activation='softmax')
])

# Custom Inception model
custom_inc_model = Sequential([
    model_inception,
    GlobalAveragePooling2D(),
    Dense(output_classes, activation='softmax')
])

# Custom ResNet model
custom_resnet_model = Sequential([
    model_resnet,
    GlobalAveragePooling2D(),
    Dense(output_classes, activation='softmax')
])

The number of classes present in the training dataset is determined using the len(train_gen.class_indices) and stored inside the output_classes variable.

A new Sequential model (custom_vgg_model) is constructed with the pre-trained VGG16 model and custom layers are added to it. The output of VGG16 is flattened to a 1D vector using the Flatten() layer, followed by a fully connected (Dense) layer with 256 neurons and the "ReLU" activation function. Then a Dropout layer with a rate of 0.5 (50%) is added to regularize the network, and finally an output layer (Dense) is added with output_classes neurons and the "Softmax" activation function.

A custom model custom_xc_model is constructed with a pre-trained Xception model (model_xception). A GlobalAveragePooling2D layer is added to reduce the dimensionality of the model and an output layer (Dense) is added with output_classes and activation function "Softmax".

Similarly, custom_inc_model and custom_resnet_model are constructed with pre-trained InceptionV3 (model_inception) and ResNet50 (model_resnet) models respectively.
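As a rough sanity check on the size of these new layers, the number of trainable parameters in a Dense layer is inputs × units + units. The sketch below assumes the usual output shapes of these backbones for 224x224 inputs (7x7x512 for VGG16, and 2048 channels after global average pooling for the other three). These shapes are not stated above, so treat them as assumptions and confirm them with model.summary().

```python
def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


output_classes = 2

# VGG16 head: Flatten -> Dense(256) -> Dropout -> Dense(2)
vgg_flat = 7 * 7 * 512  # assumed VGG16 feature-map size for 224x224 inputs
vgg_head = dense_params(vgg_flat, 256) + dense_params(256, output_classes)

# Xception / InceptionV3 / ResNet50 head: GlobalAveragePooling2D -> Dense(2)
gap_channels = 2048  # assumed channel count of the last feature map
gap_head = dense_params(gap_channels, output_classes)

print(vgg_head, gap_head)  # 6423298 4098
```

Under these assumptions, the Flatten-based head is far larger than the pooling-based heads, which is one reason GlobalAveragePooling2D is a popular choice for small datasets.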

Compiling Models with Optimizer, Metrics and Loss

# Compiling the model
models = [custom_vgg_model, custom_xc_model, custom_inc_model, custom_resnet_model]
for model in models:
    model.compile(
        optimizer='adam',
        loss='categorical_crossentropy',
        metrics=['accuracy']
    )

A list called models is defined that contains a collection of fine-tuned models.

A loop is created that iterates through the models in the list, and each model is compiled with an optimizer named "Adam", the loss is computed using 'categorical_crossentropy', and the 'accuracy' is used as metrics to measure the accuracy.
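To show what the softmax output layer and the categorical cross-entropy loss compute, here is a small pure-Python sketch for a single image with made-up scores. It is a simplified illustration, not Keras's implementation.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for stability
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot_label, probabilities):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_label, probabilities))


probs = softmax([2.0, 0.0])                         # scores for ["cat", "dog"]
loss_cat = categorical_crossentropy([1, 0], probs)  # true class: cat
loss_dog = categorical_crossentropy([0, 1], probs)  # true class: dog
print(probs, loss_cat, loss_dog)
```

The loss is small when the model puts most of the probability on the true class and large otherwise. This loss fits here because flow_from_directory() yields one-hot encoded labels by default (class_mode="categorical").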

Training the Models

# Training the model
model_names = [(custom_vgg_model, "VGG16"), (custom_xc_model, "Xception"), (custom_inc_model, "InceptionV3"), (custom_resnet_model, "ResNet50")]
for model, model_name in model_names:
    print(f">>> Training {model_name} model:")
    model.fit(
        train_gen,
        validation_data=valid_gen,
        epochs=10,
        verbose=0
    )
    print(f">>> Evaluating {model_name} on the Test data:")
    test_pred = model.predict(test_gen)
    test_labels = test_gen.classes
    # accuracy_score takes the true labels first, then the predicted labels
    test_accuracy = accuracy_score(test_labels, np.argmax(test_pred, axis=1))
    print(f">>> Test Accuracy for {model_name}: {test_accuracy * 100:.2f}%.")

A list (model_names) is created in which each item is a pair that consists of a fine-tuned model (e.g. custom_vgg_model) and model name (e.g. "VGG16").

A loop is created that iterates through the fine-tuned model and associated model name from the list model_names. The model.fit() method is called to train each model with the following arguments:

train_gen: Training data generator that will provide batches of training data on which the model is going to be trained.
validation_data=valid_gen: Validation data generator that will be used to check the performance of the model on the validation set without actually training on it.
epochs=10: The model will train for 10 epochs, which means it will pass over the entire training set 10 times.
verbose=0: This suppresses the progress output during training.

Then the model predicts on the test_gen data using the model.predict() function and stores the results in test_pred.

Following that, the actual test labels are collected using the test_gen.classes and stored inside the test_labels variable. This will be used to compare against the predicted labels.

In the final step, the accuracy of the models is evaluated using the accuracy_score. The accuracy is determined by comparing the true test labels (test_labels) with predicted test labels (np.argmax(test_pred, axis=1)).

Here, np.argmax(test_pred, axis=1) returns, for each row of test_pred, the index of the largest predicted probability. That index is the predicted class, so you get an array of class labels 0 and 1.
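The following dependency-free sketch mirrors these two steps on made-up numbers. The argmax_rows and accuracy helpers are hypothetical stand-ins for np.argmax(..., axis=1) and accuracy_score.

```python
def argmax_rows(rows):
    """Index of the largest value in each row, like np.argmax(rows, axis=1)."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


def accuracy(true_labels, predicted_labels):
    """Fraction of predictions that match the true labels."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


test_pred = [[0.9, 0.1], [0.3, 0.7], [0.6, 0.4], [0.2, 0.8]]  # made-up probabilities
test_labels = [0, 1, 1, 1]                                     # made-up true classes

predicted = argmax_rows(test_pred)
print(predicted)                         # [0, 1, 0, 1]
print(accuracy(test_labels, predicted))  # 0.75
```

Three of the four predictions match the true labels, giving an accuracy of 75%.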

Result

>>> Training VGG16 model:
>>> Evaluating VGG16 on the Test data:
10/10 [==============================] - 19s 2s/step
>>> Test Accuracy for VGG16: 91.00%.
>>> Training Xception model:
>>> Evaluating Xception on the Test data:
10/10 [==============================] - 1s 44ms/step
>>> Test Accuracy for Xception: 100.00%.
>>> Training InceptionV3 model:
>>> Evaluating InceptionV3 on the Test data:
10/10 [==============================] - 2s 55ms/step
>>> Test Accuracy for InceptionV3: 99.00%.
>>> Training ResNet50 model:
>>> Evaluating ResNet50 on the Test data:
10/10 [==============================] - 2s 45ms/step
>>> Test Accuracy for ResNet50: 62.00%.

When you run each model for 10 epochs on the training dataset, the following accuracy on the test data is obtained:

VGG16: 91%
Xception: 100%
InceptionV3: 99%
ResNet50: 62%

As can be seen, the Xception model performed exceptionally well and achieved 100% test accuracy, whereas the ResNet50 model struggled to learn and achieved 62% test accuracy.

You can try out different pre-trained models and datasets. During fine-tuning, you can also experiment with increasing the number of epochs or adding more fully connected layers.

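Please keep in mind that your results may vary depending on factors such as the random initialization of the new layers, the order in which training batches are shuffled, and the library versions you use.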

Visualizing Train and Valid Accuracy

You can determine the learning curve for each model by visualizing the training and validation accuracy for each epoch.

Adjust the above code as shown below:

import matplotlib.pyplot as plt  # Needed for plotting the learning curves

# Training the model
model_histories = []  # To store train and valid accuracies
model_names = [(custom_vgg_model, "VGG16"), (custom_xc_model, "Xception"), (custom_inc_model, "InceptionV3"), (custom_resnet_model, "ResNet50")]
for model, model_name in model_names:
    print(f">>> Training {model_name} model:")
    result = model.fit(
        train_gen,
        validation_data=valid_gen,
        epochs=10,
        verbose=0
    )
    model_histories.append((result.history, model_name))
    print(f">>> Evaluating {model_name} on the Test data:")
    test_pred = model.predict(test_gen)
    test_labels = test_gen.classes
    # accuracy_score takes the true labels first, then the predicted labels
    test_accuracy = accuracy_score(test_labels, np.argmax(test_pred, axis=1))
    print(f">>> Test Accuracy for {model_name}: {test_accuracy * 100:.2f}%.")

# Plot learning curves for each model
plt.figure(figsize=(12, 6))
for i, (history, model_name) in enumerate(model_histories):
    plt.subplot(2, 2, i + 1)
    plt.plot(history['accuracy'], label='Train Accuracy')
    plt.plot(history['val_accuracy'], label='Valid Accuracy')
    plt.legend()
    plt.title(f'Model Name = {model_name}')
plt.suptitle("Model Performance")
plt.tight_layout()
plt.show()

The above code will produce the following image showing the plots for each model that contains two lines: training accuracy and validation accuracy:

Using the plot above, you can examine the training and validation accuracy of each model. As can be seen, the InceptionV3 model reached 100% accuracy around the 5th epoch, while the Xception model reached 100% accuracy around the 7th epoch.

Conclusion

In this tutorial, you've seen how commonly used pre-trained models (VGG, Xception, Inception, and ResNet) are fine-tuned and then trained on an image dataset to compare their performance on the test data.

That's all for now

Keep Coding

A Guide to Use Sessions in Flask App for Storing Data Temporarily

Sachin Pal — Sat, 09 Sep 2023 12:08:29 GMT

In this article, you'll see what sessions are and how to use them in a Flask application to store information.

What are Sessions?

In general, a session is an active period of interaction between the user and the application. The entirety of the session is the time the user spends on an application from logging in to logging out.

Sessions can store and manage data across multiple requests. Sessions are particularly useful for managing user-related data and maintaining it between different interactions of a web application.

For instance, you can store the authentication status (whether the user is logged in or not) of the user on the server when the user logs in. Storing this information in a session allows the server to remember that the user is authenticated even as they navigate through different parts of the web application.

Using Sessions in a Flask App

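To use sessions in a Flask app, you can use the session object provided by the flask module. By default, Flask keeps the session data in a cookie on the client and cryptographically signs it with the app's secret key, so the user can read the data but cannot modify it.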

Importing Required Modules

# Importing required modules
from flask import session
from flask import Flask, render_template, request, redirect, url_for
from datetime import timedelta

The session is imported from flask which will be used to access and manipulate session data in a Flask application.

The other functions and classes are imported from flask which will help in creating a Flask application and handling routes.

The timedelta class is imported from datetime module which will help in calculating time intervals.

Setting Up Flask App

# Creating Flask App
app = Flask(__name__)

# Setting up Secret Key for Session Management
app.secret_key = "MY_SECRET_KEY"

The Flask application is created by instantiating the Flask(__name__) and storing the instance in the app variable.

The secret key is set to "MY_SECRET_KEY" by assigning it to the app object's secret_key attribute. Flask uses this key to sign the session cookie, so sessions cannot be used without it.
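To illustrate why the secret key matters, the sketch below signs a value with HMAC from the standard library and rejects it after tampering. This is only a conceptual model: Flask delegates the real signing to the itsdangerous library, and the sign and verify helpers here are hypothetical.

```python
import hashlib
import hmac

SECRET_KEY = b"MY_SECRET_KEY"


def sign(value: str) -> str:
    """Append an HMAC signature to the value (hypothetical helper)."""
    digest = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return f"{value}.{digest}"


def verify(signed: str) -> bool:
    """Check that the signature still matches the value."""
    value, _, digest = signed.rpartition(".")
    expected = hmac.new(SECRET_KEY, value.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(digest, expected)


cookie = sign("user=Sachin")
tampered = cookie.replace("Sachin", "Admin")

print(verify(cookie))    # True
print(verify(tampered))  # False
```

Anyone who knows the key can forge a valid signature, which is why the key must stay secret.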

Setting Up Session Timeout

# Setting Lifetime of Sessions
app.permanent_session_lifetime = timedelta(minutes=1)

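The session lifetime is set to one minute (timedelta(minutes=1)) by assigning it to the app object's permanent_session_lifetime attribute. Note that with Flask's built-in sessions this lifetime only applies to sessions marked as permanent, which you do by setting session.permanent = True inside a request. Such a session expires after one minute, whereas a non-permanent session lasts until the browser is closed.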

Routes and View Functions

In this section, you will create routes and view functions for registering a username, storing it in the session, and displaying relevant messages for each request. You'll then create a route and view function to remove the username from the session.

Home Route and View Function

# Home Route
@app.route("/")
def home():
    return render_template("home.html")

This simple code defines the route ("/") and a view function called home that displays the home.html template.

Route and View Function for Registering Username

# Route for Registering Username
@app.route("/add", methods=["GET", "POST"])
def add_username():
    session['message'] = "Enter your username to continue."
    if request.method == 'POST':
        username = request.form['username']
        session['user'] = username
        session['greet'] = f"Successfully registered username - {session['user']}."
        return redirect(url_for("home"))
    return render_template("add_username.html")

The "/add" route, which can handle both GET and POST requests, is defined in this code snippet.

The view function add_username() is defined, and it is in charge of displaying the add_username.html template. A message is stored in the session using session['message'] within the view function body.

The function then determines whether the request is a POST request and retrieves the value of "username" from the form within that conditional block. Using session['user'] = username, the retrieved username is saved in the session under the key 'user'.

After that, a message is saved in the session under the key 'greet' to notify users of their successful username registration, and the user is redirected to the home route.

Route and View Function for Removing Username

# Route for Removing Username
@app.route("/remove")
def remove_username():
    # A default of None avoids a KeyError when no username is stored
    session.pop('user', None)
    session['notify'] = "Username Removed from Session Storage."
    return redirect(url_for("home"))

The code defines a "/remove" route and a remove_username view function.

Using session.pop('user', None), the view function removes the username from the session. The None default means the route also works when no username is stored. Following that, a message is saved under the key 'notify' to tell users that the username has been removed from the session, and they are redirected to the home page.
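The session object supports the usual dictionary methods, so pop() behaves as it does on a plain dict: without a default it raises KeyError when the key is missing. The sketch below uses an ordinary dictionary as a stand-in for an empty session.

```python
session_data = {}  # stand-in for a session with no username stored

try:
    session_data.pop("user")
    outcome = "removed"
except KeyError:
    outcome = "KeyError"

safe_result = session_data.pop("user", None)  # no error, returns None
print(outcome, safe_result)  # KeyError None
```

Passing a default is the safer choice for a route that a user might visit without having registered first.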

Creating Templates

You need to create three HTML templates named base.html, home.html, and add_username.html within the templates directory.

base.html

This template includes an HTML skeleton as well as Bootstrap CSS and JavaScript.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta content="width=device-width, initial-scale=1" name="viewport">
    <link crossorigin="anonymous" href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css"
          integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3" rel="stylesheet">
    <title>Session in Flask</title>
</head>
<body>
{% block content %} {% endblock %}
<script crossorigin="anonymous"
        integrity="sha384-ka7Sk0Gln4gmtz2MlQnikT1wXgYsOg+OMhuP+IlRH9sENBO0LRn5q+8nbTov4+1p"
        src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>

home.html

{% extends "base.html" %}
{% block content %}
{% if not session['user'] %}
<div class="alert alert-success alert-dismissible fade show" role="alert">
    {{ session['notify'] }}
    <button aria-label="Close" class="btn-close" data-bs-dismiss="alert" type="button"></button>
</div>
{% else %}
<div class="alert alert-success alert-dismissible fade show" role="alert">
    {{ session['greet'] }}
    <button aria-label="Close" class="btn-close" data-bs-dismiss="alert" type="button"></button>
</div>
{% endif %}
<div class="container">
    <h1>Welcome To GeekPython!</h1>
    {% if session['user'] %}
    <h3>Hi🖐, {{ session['user'] }}!</h3>
    <div class="my-3">
        <a class="btn btn-info mb-3" href="{{ url_for('remove_username') }}">Log out</a>
    </div>
    {% else %}
    <h3>You Need to Register First!</h3>
    <a class="btn btn-info mb-3" href="{{ url_for('add_username') }}">Register</a>
    {% endif %}
</div>
{% endblock %}

In the above code snippet, the {% if not session['user'] %} block checks whether a username is missing from the session. If it is, the message stored under the key 'notify' is displayed; otherwise, the message stored in session['greet'] is displayed.

The {% if session['user'] %} block checks whether a username is present. If it is, the username is displayed along with a "Log out" button; otherwise, a "Register" button is displayed for registering a username.

add_username.html

{% extends "base.html" %}
{% block content %}
<div class="alert alert-success alert-dismissible fade show" role="alert">
    {{ session['message'] }}
    <button aria-label="Close" class="btn-close" data-bs-dismiss="alert" type="button"></button>
</div>
<div class="container">
    <h1>Enter Your Username👇</h1>
    <form action="/add" method="post">
        <div class="mb-3">
            <label class="form-label" for="name">Username</label>
            <input class="form-control" id="name" name="username" type="text">
        </div>
        <button class="btn btn-info" type="submit">Submit</button>
    </form>
</div>
{% endblock %}

This template includes a form with a username field and a "Submit" button to submit the username and save it in the session. At first, a message is displayed on the request to the "/add" URL, which is saved in the session under the key 'message'.

Demonstration

Here are some screenshots of testing the routes.

Default or Home page preview.

Adding username and submitting the form.

Username registered

The username was removed from the session

How to use Flask-Session

Flask-Session is an extension of Flask that offers extra support for managing sessions in the Flask application. It is designed to enhance the capability of session handling by providing various session storage types and configurations in Flask.

Installation

You need to install the package before implementing it in the Flask application. You can install it by running the following command using pip (Python Package Manager).

pip install Flask-Session

Implementation

# Importing required modules
from flask import session
from flask_session import Session
from flask import Flask, render_template, request, redirect, url_for

# Creating Flask App
app = Flask(__name__)

# Setting up Secret Key for Session Management
app.secret_key = "MY_SECRET_KEY"

# Configuring Session
app.config['PERMANENT_SESSION_LIFETIME'] = 60  # Session Lifetime
app.config['SESSION_TYPE'] = "filesystem"  # Session Storage Type

# Path to Storing Session
app.config['SESSION_FILE_DIR'] = "session_data"

# Initializing the Session Extension
Session(app)

# Remaining Code

The code from earlier is modified to use the Flask-Session. The changes to the code are listed as follows:

from flask_session import Session: The Session class is imported from the flask_session module.
app.config['PERMANENT_SESSION_LIFETIME']: The PERMANENT_SESSION_LIFETIME configuration is used to set the expiration time of the active session.
app.config['SESSION_TYPE']: This is used to set the session storage type. In this case, the storage type is set to "filesystem" which means that the session data will be stored in the server's filesystem.
app.config['SESSION_FILE_DIR']: This is used to set the path to the directory where the session data will be stored.
Session(app): The Flask-Session is initialized with the Flask application (app). This will connect the Flask-Session with the app and all the configurations will be applied.
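The value 60 given to PERMANENT_SESSION_LIFETIME expresses the same one-minute lifetime that the earlier example wrote as timedelta(minutes=1). A quick check of the conversion using only the standard library:

```python
from datetime import timedelta

lifetime = timedelta(minutes=1)
lifetime_in_seconds = int(lifetime.total_seconds())
print(lifetime_in_seconds)  # 60
```

Using a timedelta can be easier to read for longer lifetimes, for example timedelta(days=7) instead of 604800.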

Most Commonly Used Configuration Keys

The following are the most commonly used configuration keys provided by Flask-Session.

Configuration Key	Description
SESSION_TYPE	Specifies which type of session interface to use. Built-in session types:
	- null: NullSessionInterface (default)
	- redis: RedisSessionInterface
	- memcached: MemcachedSessionInterface
	- filesystem: FileSystemSessionInterface
	- mongodb: MongoDBSessionInterface
	- sqlalchemy: SqlAlchemySessionInterface
SESSION_PERMANENT	Whether to use permanent sessions. Defaults to `True`.
PERMANENT_SESSION_LIFETIME	The lifetime of a permanent session, given as a datetime.timedelta or as an integer number of seconds.
SESSION_FILE_DIR	The directory where session files are stored. Defaults to the flask_session directory under the current working directory.

Flask-Session provides additional configuration keys that extend the capabilities of Flask's built-in session. You can refer to the official docs for more.

Conclusion

The session is the duration of a user's interaction with an application, beginning with logging in and ending when the user logs out. During this time, the application is able to store and manage user-specific data across multiple requests.

Flask's built-in session can be used to store and manage data across requests in a signed cookie, and Flask-Session extends it with server-side storage options such as the filesystem.

Let's recall what you've learned:

What is a session?
How to use session in Flask by creating a Flask app and storing user-related data in the session.
How to use Flask-Session to add additional application configurations such as session storage type and directory.

That's all for now

Keep Coding

Nested for Loops in Python - How to Use Them

Sachin Pal — Mon, 28 Aug 2023 15:20:58 GMT

You may have used Python for loops to iterate through iterable objects in order to access them independently and perform various operations. You're probably familiar with the concept of nested for loops as well.

In Python, nested for loops are loops that have one or more for loops within them.

Nested for loops have a structure that is similar to the following code:

for x in my_iterable:
    for item in x:
        print(item.upper())

You may have seen the use of nested for loops in the creation of various patterns of asterisk or hyphen symbols.

Working of Nested for Loops

A nested for loop is made up of an outer loop and an inner loop. In the structure shown earlier, the first line is the outer for loop, and the inner for loop sits inside its body.

for anything in article: # Outer for-loop
    # Some code here
    for everything in anything: # Inner for-loop
        # Some code here

But how do they work together to iterate through data? Let's uncover the iteration process of nested for loops.

my_iterable = ["Sachin", "Rishu", "Yashwant"]
for item in my_iterable:
    for each_elem in item:
        print(each_elem, end=" ")

The outer loop accesses each item in my_iterable.

For each item in the outer loop (e.g., "Sachin"), the inner loop iterates through each individual element in the item (e.g., "S", "a", "c", "h", "i", "n").

The print statement prints each character with a space, producing output like "S a c h i n ".

S a c h i n R i s h u Y a s h w a n t

Example of Nested for Loops

names = ["Sachin", "Rishu", "Yashwant"]
games = ["Cricket", "Snooker"]
for item in names: # Outer loop
    for game in games: # Inner loop
        print(item, "plays", game)

The outer loop iterates through the names list (Sachin, Rishu, Yashwant).

For each name in the outer loop, the inner loop iterates through the games list (Cricket, Snooker).

The print statement combines the current name with each game to output sentences like "Sachin plays Cricket" and "Sachin plays Snooker".

Sachin plays Cricket
Sachin plays Snooker
Rishu plays Cricket
Rishu plays Snooker
Yashwant plays Cricket
Yashwant plays Snooker
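When two loops simply pair every item of one list with every item of another, as in this example, the standard library's itertools.product offers an equivalent alternative. This is an optional variation rather than a replacement for understanding the nested form:

```python
from itertools import product

names = ["Sachin", "Rishu", "Yashwant"]
games = ["Cricket", "Snooker"]

# product() yields the same (name, game) pairs, in the same order,
# as the outer and inner loops above
pairs = list(product(names, games))
for name, game in pairs:
    print(name, "plays", game)
```

The explicit nested loops remain the better fit when the inner iterable depends on the current item of the outer loop.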

Nested for Loops with range() Function

for i in range(1, 2):
    for x in range(3):
        for num in range(x):
            print(num)

The above code might be a bit confusing, so let's break it down into pieces:

The outer for loop (for i in range(1, 2)) executes once because the range is from 1 to 2 (not inclusive), resulting in a single iteration. This loop governs the number of times the inner loops are executed.

The middle or inner for loop runs three times because of the range from 0 to 2. The iterations produce the values 0, 1, and 2.

The innermost for loop's behavior depends on the middle for loop because its range relies on the current value of the x variable.

When the value of x is 0 (first iteration), the innermost loop doesn't execute because range(0) is empty. When the value of x is 1 (second iteration), it runs once and prints 0 because of range(1). Finally, when x is 2 (third iteration), it runs twice and prints 0 and 1 because of range(2).
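To trace this behavior, the variation below records each value together with the x that produced it instead of printing it directly:

```python
printed = []

for i in range(1, 2):
    for x in range(3):
        for num in range(x):
            printed.append((x, num))  # record which x produced each num

print(printed)                      # [(1, 0), (2, 0), (2, 1)]
print([num for _, num in printed])  # [0, 0, 1]
```

No entry has x equal to 0, which confirms that the innermost loop is skipped on the first pass of the middle loop.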

Application of Nested for Loops in Real World

There are various ways to use nested for loops. They come in handy when you need to perform multiple rounds of iterations, deal with layered data structures like lists inside dictionaries or lists within other lists for navigating through data, or when you want to process and analyze text by breaking it down into smaller parts.
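As a small illustration of the layered-data case, the sketch below walks through a dictionary whose values are lists. The names and marks are made up for the example.

```python
scores = {
    "Sachin": [82, 91],
    "Rishu": [75, 64],
}  # made-up data for illustration

totals = {}
for name, marks in scores.items():  # outer loop: one student at a time
    total = 0
    for mark in marks:              # inner loop: that student's marks
        total += mark
    totals[name] = total

print(totals)  # {'Sachin': 173, 'Rishu': 139}
```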

Example - Tokenizing Sentence into Words and Characters

text = "Hey, there! Welcome to GeekPython."
sentences = text.split(". ")   # Splitting the text into sentences
print("Tokenized Sentences into Words and Characters:")
for sentence in sentences:
    print("- Sentence:", sentence)
    words = sentence.split(" ")  # Splitting sentence into words
    for word in words:
        print("  - Word:", word)
        for character in word:
            print("    - Character:", character)

The above code tokenizes (breaking sentences into small units) the text data into words and characters using the nested for loops. The code goes as follows:

The text data is initially split into sentences using text.split(". "), where the period followed by a space is used as the delimiter.

The outer for loop iterates through sentences and prints them. The current sentence is then split into words using sentence.split(" ").

The next for loop then iterates through the words of the current sentence and prints them.

Finally, a third for loop is nested within the word loop and processes the characters of each word, one at a time.

The inner for loops are executed for every cycle of the outer for loop. This signifies that the code's process is repeated for each sentence. However, since the provided code only deals with one sentence, it executes just once.

Tokenized Sentences into Words and Characters:
- Sentence: Hey, there! Welcome to GeekPython.
  - Word: Hey,
    - Character: H
    - Character: e
    - Character: y
    - Character: ,
  - Word: there!
    - Character: t
    - Character: h
    - Character: e
    - Character: r
    - Character: e
    - Character: !
  - Word: Welcome
    - Character: W
    - Character: e
    - Character: l
    - Character: c
    - Character: o
    - Character: m
    - Character: e
  - Word: to
    - Character: t
    - Character: o
  - Word: GeekPython.
    - Character: G
    - Character: e
    - Character: e
    - Character: k
    - Character: P
    - Character: y
    - Character: t
    - Character: h
    - Character: o
    - Character: n
    - Character: .
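To see the outer loop run more than once, the text has to contain the ". " delimiter. The variation below uses a made-up two-sentence string and counts words and characters instead of printing each one:

```python
text = "Nested loops are handy. They repeat work for each item."  # made-up text
sentences = text.split(". ")

summary = []
for sentence in sentences:          # outer loop: sentences
    words = sentence.split(" ")
    char_count = 0
    for word in words:              # middle loop: words
        for character in word:      # inner loop: characters
            char_count += 1
    summary.append((len(words), char_count))

print(summary)  # [(4, 19), (6, 26)]
```

Each tuple holds the number of words and the number of characters for one sentence, so the two entries show that the inner loops ran once per sentence.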

You can also tokenize text read from a file. To do so, adjust the code as shown below:

with open("test.txt") as file:
    content = file.read()
    sentences = content.split(". ")
    print("Tokenized sentences into words and characters:")
    for sentence in sentences:
        print("- Sentence:", sentence)
        words = sentence.split(" ")
        for word in words:
            print("  - Word:", word)
            for char in word:
                print("    - Character:", char)

Conclusion

Python's for loops are employed to sequentially go through a series of data elements, allowing access to each of them individually. Furthermore, a single for loop can contain one or more additional for loops within it, a concept commonly known as nested for loops.

In the context of nested for loops, during every iteration of the outer for loop, the inner for loop iterates through each item present in the respective iterable. To illustrate, consider the scenario of shopping: envision visiting various shops and inspecting the items they offer. You start by exploring the first shop, examining all its items, and then proceed to the next shop, repeating this process until you have surveyed all available shops.

Let's recall what you've learned:

What are nested for loops in Python
How nested for loops work
Code examples of nested for loops

That's all for now

Keep Coding

Flashing Messages using the flash() Function in Flask

Sachin Pal — Wed, 23 Aug 2023 06:39:56 GMT

Introduction

The Flask flash() function is an efficient way to display temporary messages to the user. This can be used to display a variety of messages, including error, notification, warning, and status messages.

By the end of this article, you'll have learned:

How to use the flash() function
Flashing messages on the frontend
Flashing messages with categories
Filtering flash messages based on categories
Best practices for effectively using flashed messages

Prerequisites

Before you dive into this tutorial on flashing messages with Flask, make sure you have a fundamental understanding of the following concepts:

Familiarity with Flask, including route handling, templates, and basic application structure.
A basic understanding of HTML markup is necessary, as you'll be working with HTML templates to render the flashed messages on the frontend.
Understanding of Jinja2 templating engine, which is integrated with Flask for rendering dynamic content within HTML templates.

How to Use the flash() Function

As previously mentioned, the flash() function is used to display a message to the user after a specific request is made. Its primary goal is to offer relevant feedback, which enhances the user experience.

The flash() function accepts two parameters:

message: The message to display to the user.
category: Specifies the message category. This is an optional parameter.

Using flash() within Flask App

This section describes how to integrate the message flashing functionality into the Flask application and display it on the frontend.

Importing flash from flask and other required modules

Create an app.py file and import the flash from flask, along with other necessary modules, into the file.

from flask import Flask, render_template, request, url_for, redirect
from flask import flash

You could also import all of these names in a single line.

Flask App Setup

app = Flask(__name__)
app.secret_key = "Sachin"

An instance of Flask is created and stored within the app variable to initialize the Flask application.

The Flask application's secret key is assigned as "Sachin" using the secret_key attribute of the app object. This key will serve for session management.

Note: It's essential to store the secret key in a secure environment, rather than hardcoding it directly within the application.
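One way to follow this note is to read the key from an environment variable. The snippet below is a minimal sketch using only the standard library; the variable name FLASK_SECRET_KEY and the helper load_secret_key() are assumptions made for illustration, not something Flask defines.

```python
import os
import secrets


def load_secret_key(env_var="FLASK_SECRET_KEY"):
    """Read the secret key from the environment.

    FLASK_SECRET_KEY is a hypothetical variable name chosen for this sketch.
    """
    key = os.environ.get(env_var)
    if key:
        return key
    # Development fallback only: a random key changes on every restart,
    # which invalidates existing sessions and any pending flashed messages.
    return secrets.token_hex(32)


# In app.py this would replace the hardcoded value:
# app.secret_key = load_secret_key()
```

In production you would normally want the application to fail loudly when the variable is missing rather than fall back to a random key.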

Implementing flash() Function within View Functions

@app.route("/info", methods=["GET", "POST"])
def add_info():
    if request.method == "POST":
        name = request.form['name']
        profession = request.form['profession']
        if name == "" or profession == "":
            flash("Invalid: Every field is required.")
            return redirect(url_for("add_info"))
        else:
            flash("Success: Info added successfully.")
            return redirect(url_for("add_info"))
    return render_template("info.html")

The route "/info" is created to manage both GET and POST requests, and it is linked to the add_info() view function.

Within the add_info() view function, it checks whether the request is a POST request, and then retrieves the values from the "Name" and "Profession" form fields.

Next, it checks whether either field is empty. If so, the message "Invalid: Every field is required." is flashed using the flash() function and the user is redirected back to the same route. Otherwise, the message "Success: Info added successfully." is flashed and the user is redirected to the same route to enter more information.

However, this won't immediately display the message on the frontend. To achieve that, you must use the get_flashed_messages() function within the template.

Implementing get_flashed_messages() Function within Message Template

Create a file named message.html within the templates directory in your project.

{% with msg = get_flashed_messages() %}
{% if msg %}
{% for message in msg %}
<div class="alert alert-warning alert-dismissible fade show" role="alert">
  <strong>{{ message }}</strong>
  <button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
</div>
{% endfor %}
{% endif %}
{% endwith %}

The get_flashed_messages() function pulls the flashed messages out of the session, so each message is shown once and does not reappear on later requests. The function is called in a with statement and its output is stored in the msg variable.
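To make the flash-then-retrieve cycle concrete, here is a simplified pure-Python model of it. This is not Flask's actual implementation: the session dictionary and the "_flashes" key are stand-ins chosen for this sketch, and the second call represents a later request.

```python
# Simplified model of message flashing; this is NOT Flask's real code.
# `session` stands in for Flask's per-user session object.
session = {}


def flash(message, category="message"):
    """Queue a message in the session for the next rendered page."""
    session.setdefault("_flashes", []).append((category, message))


def get_flashed_messages():
    """Return the queued messages and remove them from the session."""
    flashes = session.pop("_flashes", [])
    return [message for _category, message in flashes]


flash("Invalid: Every field is required.")
first_read = get_flashed_messages()   # the page rendered after the redirect
second_read = get_flashed_messages()  # a later request: nothing is left

print(first_read)   # ['Invalid: Every field is required.']
print(second_read)  # []
```

The point of the model is that a flashed message is consumed by the first page that reads it, which is why it survives exactly one redirect.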

The {% if msg %} statement evaluates whether any flashed messages are stored in the msg variable. If such messages exist, the {% for message in msg %} block iterates through each message and renders them using {{ message }} within the Bootstrap alert component.

Finally, the loop is terminated using {% endfor %}, the conditional block is terminated using {% endif %}, and the with statement block is terminated with {% endwith %}.

Now you can include the message template in any of your templates to display the messages.

Preparing the Frontend

Create two HTML files in the templates folder with the names base.html and info.html.

base.html

This file will contain the basic HTML layout with Bootstrap CSS and JavaScript.

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="utf-8">
    <meta content="width=device-width, initial-scale=1" name="viewport">
    <link crossorigin="anonymous" href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css"
          integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3" rel="stylesheet">
    <title>Flash Message Using Flask</title>
</head>
<body>
{% block content %} {% endblock %}
<script crossorigin="anonymous"
        integrity="sha384-ka7Sk0Gln4gmtz2MlQnikT1wXgYsOg+OMhuP+IlRH9sENBO0LRn5q+8nbTov4+1p"
        src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js"></script>
</body>
</html>

info.html

This file will hold a form for inputting information and includes the message.html template to display flashed messages.

{% extends 'base.html' %}
{% block content %}
{% include 'message.html' %}
<div class="container my-5">
    <form action="/info" method="post">
        <div class="mb-3">
            <label class="form-label" for="name">Name</label>
            <input class="form-control" id="name" name="name" type="text">
        </div>
        <div class="mb-3">
            <label class="form-label" for="profession">Profession</label>
            <input class="form-control" id="profession" name="profession" type="text">
        </div>
        <button class="btn btn-dark" type="submit">Submit</button>
    </form>
</div>
{% endblock %}

Running the Flask Application

When you run the app.py file and open the frontend in a browser, you'll see flashed messages after submitting the form.

Here's a preview of the application hosted on the 127.0.0.1:5000 server using the "/info" route.

When submitting the form without completing the "Profession" field, the application displays the relevant flashed message.

When the form is submitted with all fields completed, the relevant message is displayed.

Flashing Messages with Categories

Each message serves a unique purpose. Some aim to caution users, while others indicate successful task completion. Assigning categories to the messages makes them distinct from one another and provides a better user experience.

The get_flashed_messages() function offers a with_categories parameter that can be set to True. Doing so enables you to utilize the category assigned to the message in the flash() function.

Assigning Categories to the Messages

To assign categories, make use of the flash() function. Adjust your add_info() view function in the app.py file as demonstrated in the following code:

@app.route("/info", methods=["GET", "POST"])
def add_info():
    if request.method == "POST":
        name = request.form['name']
        profession = request.form['profession']
        if name == "" or profession == "":
            flash("Invalid: Every field is required.", "danger")
            return redirect(url_for("add_info"))
        else:
            flash("Success: Info added successfully.", "success")
            return redirect(url_for("add_info"))
    return render_template("info.html")

The categories danger and success are assigned to the error message and the success message, respectively. These categories correspond to Bootstrap CSS classes. Specifically, "danger" signifies errors, while "success" signifies successful completion.

Retrieving Categories and Applying Styles

To use the categories of flashed messages in templates, retrieve them with get_flashed_messages(with_categories=True) within the template. Each item is then a (category, message) pair, which you iterate through in the same way as you did with regular messages.

Adjust your message.html template as shown below:

{% with msg = get_flashed_messages(with_categories=True) %}
{% if msg %}
{% for category, message in msg %}
<div class="alert alert-{{ category }} alert-dismissible fade show" role="alert">
  <strong>{{ message }}</strong>
  <button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
</div>
{% endfor %}
{% endif %}
{% endwith %}

The categories are retrieved and utilized to apply specific CSS properties to the Bootstrap alert component, tailored to the respective request.

With these changes in place, run the application again. You'll observe messages being highlighted in varying colors that correspond to the specific request.

The illustration below depicts how the message's appearance differs between invalid and successful form submissions, respectively.

Another Use Case of Categories

Categories can also serve as prefixes to messages, which is another practical application. Replace the existing message line inside the <div> block of the Bootstrap alert component with the following line.

<strong>{{ category }} - {{ message }}</strong>

The provided code will add the corresponding category as a prefix to the actual message. This will result in an appearance similar to what's displayed in the image below.

Filtering Flash Messages Based on Categories

To filter messages based on categories, you can utilize the category_filter parameter within the get_flashed_messages() function in the template. This parameter accepts a list of message categories.

Create a new file named filtered_message.html within the templates folder. Then, insert the provided code snippet into this file.

{% with error_msg = get_flashed_messages(category_filter=["danger"]) %}
{% if error_msg %}
{% for message in error_msg %}
<div class="alert alert-danger alert-dismissible fade show" role="alert">
  <strong>{{ message }}</strong>
  <button type="button" class="btn-close" data-bs-dismiss="alert" aria-label="Close"></button>
</div>
{% endfor %}
{% endif %}
{% endwith %}

The code snippet mentioned above retrieves and filters messages based on the "danger" category using get_flashed_messages(category_filter=["danger"]). If there are flashed messages matching this category, the code iterates through them and displays them when the relevant request is made.
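The effect of the with_categories and category_filter arguments can be illustrated with a small pure-Python model. The helper select_messages() below is hypothetical and only mimics the behaviour described in this article; it is not part of Flask.

```python
def select_messages(flashes, with_categories=False, category_filter=()):
    """Mimic the two optional arguments described in this article.

    flashes: list of (category, message) tuples, in the order flashed.
    """
    if category_filter:
        flashes = [pair for pair in flashes if pair[0] in category_filter]
    if with_categories:
        return list(flashes)
    return [message for _category, message in flashes]


queued = [
    ("danger", "Invalid: Every field is required."),
    ("success", "Success: Info added successfully."),
]

# Only the messages flashed with the "danger" category.
only_errors = select_messages(queued, category_filter=["danger"])

# (category, message) pairs, used here to build the Bootstrap alert classes.
pairs = select_messages(queued, with_categories=True)
alert_classes = [f"alert alert-{category}" for category, _message in pairs]

print(only_errors)    # ['Invalid: Every field is required.']
print(alert_classes)  # ['alert alert-danger', 'alert alert-success']
```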

To display only messages with the "danger" category on the frontend, include the filtered_message.html template within info.html.

Best Practices for Using Flashed Messages in Flask

Flashed messages are highly valuable for enhancing the user experience through prompt feedback. To maximize their impact, it's important to implement the following practices.

Provide clear and concise messages, and avoid displaying messages on every request to prevent overwhelming users.
When presenting error messages, include instructions on how users can address the issue.
Assign suitable categories to messages that accurately reflect their content and purpose.
As flashed messages are temporary, consider adding a dismissal button to allow users to remove them once they are no longer needed.
Refrain from revealing any sensitive details in messages, as this could potentially compromise the security of your application.

Conclusion

Flashing messages in an application notifies users about an action that was performed, be it a completed task, an error, or a warning. It enhances the user experience, but only when the best practices above are kept in mind.

In this article, the secret key is hardcoded in the application, which is not a good practice. Store sensitive details like secret keys within the protected environment and refrain from revealing them in messages.

That's all for now

Keep Coding

How to Create and Register Blueprint in Flask

Sachin Pal — Sat, 19 Aug 2023 07:00:40 GMT

Introduction

Large applications can become complex and difficult to manage due to the presence of numerous components and intricate structures.

Flask blueprints help in organizing large applications into smaller, manageable components, leading to enhanced maintainability of the application.

Blueprints can contain views, templates, and static files for various components, similar to the structure of a typical Flask application. These blueprints can be registered with the Flask app to integrate them into the application.

Flask App Structure

If you've used the Flask framework before, you might structure your Flask application as follows:

.
├── app.py
├── static/
└── templates/
    └── index.html

After you've created the structure for your app, you'll add some code to the app.py file.

# app.py
from flask import Flask, render_template

app = Flask(__name__)

@app.route("/")
def home():
    return render_template("index.html")

In smaller projects, defining views within the main file may not pose a significant problem. However, in more complex projects, when you need to create routes for various components such as user management, admin functions, profiles, voting, and more, organizing these routes within a single file can become challenging. This can lead to poor code maintainability and make the project harder to manage.

This is precisely where blueprints prove valuable, as they allow you to structure your application into smaller components, enhancing maintainability.

Creating the Blueprint

Add the following code to a new Python file at the root level, blueprint.py.

# blueprint.py
from flask import Blueprint

bp = Blueprint("blueprint", __name__)

@bp.route("/")
def home():
    return "Hello"

@bp.route("/user")
def user_info():
    return "User Info"

The code imports the Blueprint class from flask, which is used to define the routes.

A Blueprint instance is created by calling Blueprint("blueprint", __name__) with two arguments: the first, "blueprint", is the blueprint's name, and the second, __name__, is the blueprint's import name. The instance is then stored in the bp variable.

The routes are defined using the Blueprint instance (bp) in a similar manner to how routes are defined using the Flask application instance (app).

The @bp.route() decorators are used to associate URL routes with the view functions defined within the blueprint. The home() function is associated with the root URL ("/"), and the user_info() function is associated with the "/user" URL.

Registering the Blueprint

A blueprint is similar to a Flask app, but it is not an app. Instead, the blueprint must be registered with a Flask app to extend the app's functionality.

Navigate to the main Flask app created in the app.py Python file and register the above-created blueprint.

# app.py
from flask import Flask
from blueprint import bp

# Flask app instance
app = Flask(__name__)
app.register_blueprint(bp)

if __name__ == "__main__":
    app.run(debug=True)

The code imports the Blueprint instance, bp, from the blueprint module (blueprint.py), and this import includes all of the blueprint's routes and views.

The register_blueprint() method is then used to register the blueprint instance (bp) with the Flask app instance (app).

If you run the app, you can access the routes defined within the blueprint.

Mounting Blueprints at Different Locations

Blueprints can be mounted at a specific URL path, which is then prefixed to all of the routes defined within the blueprint.

You can make this happen by using the url_prefix parameter while registering the blueprint using the register_blueprint() method in the Flask app.

# app.py
from flask import Flask
from blueprint import bp

# Flask app instance
app = Flask(__name__)
app.register_blueprint(bp, url_prefix="/demo")

if __name__ == "__main__":
    app.run(debug=True)

The url_prefix is now "/demo". This means that the routes within the blueprint ("/" and "/user") can be accessed by adding "/demo" at the beginning of their URL paths.

The complete URL of the ("/") route is changed to "/demo/" for the home() function. Also, the complete URL of the "/user" route will now be "/demo/user" for the user_info() function.

You can also set the url_prefix when creating the Blueprint instance. The Blueprint class offers a url_prefix parameter, and the following code demonstrates its usage.

# blueprint.py
from flask import Blueprint

bp = Blueprint("blueprint", __name__, url_prefix="/sample")

The URL path for the ("/") route will become "/sample/". Likewise, the URL path for the "/user" route will change to "/sample/user".

Note: If you set the url_prefix parameter inside the blueprint, avoid setting it again during the blueprint registration. The value passed to register_blueprint() overrides the blueprint's own URL prefix.
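The prefixing rules described above can be sketched in a few lines of plain Python. The helpers effective_prefix() and full_path() are hypothetical names used only for this illustration; they model the behaviour described in this section rather than reproduce Flask's internals.

```python
def effective_prefix(blueprint_prefix=None, registration_prefix=None):
    """A prefix given at registration wins over the blueprint's own."""
    if registration_prefix is not None:
        return registration_prefix
    return blueprint_prefix


def full_path(prefix, rule):
    """Join a URL prefix and a route rule with exactly one slash."""
    if prefix is None:
        return rule
    return prefix.rstrip("/") + "/" + rule.lstrip("/")


# Prefix set only at registration, as in the "/demo" example.
demo = effective_prefix(registration_prefix="/demo")
print(full_path(demo, "/"))      # /demo/
print(full_path(demo, "/user"))  # /demo/user

# Prefix set in both places: the registration value is the one used.
both = effective_prefix(blueprint_prefix="/sample", registration_prefix="/demo")
print(full_path(both, "/user"))  # /demo/user
```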

Templates and Static Folders

You have several options for organizing your app using blueprints.

.
└── app/
    ├── __init__.py
    ├── admin/
    │   ├── __init__.py
    │   ├── routes.py
    │   ├── static/
    │   └── templates/
    ├── user/
    │   ├── __init__.py
    │   ├── routes.py
    │   ├── static/
    │   └── templates/
    └── models.py

If your project's structure aligns with the example above, you'll have to indicate the locations of templates and static folders within your Blueprint.

The Blueprint class gives you two parameters, template_folder and static_folder, which allow you to define the path (either absolute or relative to the blueprint's location) to the blueprint's templates and static folders when creating an instance of the Blueprint class.

# admin/routes.py
from flask import Blueprint

admin_bp = Blueprint("admin_blueprint",
                     __name__,
                     template_folder="templates",
                     static_folder="static")

# user/routes.py
from flask import Blueprint

user_bp = Blueprint("user_blueprint",
                    __name__,
                    template_folder="templates",
                    static_folder="static")

Both "admin_bp" and "user_bp" blueprints have their own directories for templates and static files. This separation ensures that their respective templates and assets are kept separate from other parts of the app, maintaining isolation and organization.

Avoid Template Name Clashes

When you're designing blueprints for different parts of your application, the arrangement of your project holds significance. For instance, referring to the layout mentioned above, duplicating HTML filenames in the admin/templates and user/templates directories can lead to naming clashes.

Flask looks for templates in the application's "templates" directory first and then in the template folders of the registered blueprints. If different blueprints provide the same relative template path, only one of them can be found: the blueprint registered first takes precedence over those registered later.
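The clash can be modelled as a first-match lookup over an ordered list of template folders. The sketch below is plain Python with a hypothetical find_template() helper; the folder names come from the layout above, and the search order is passed in explicitly so the example makes no claim about how Flask orders the folders.

```python
def find_template(name, search_order, folders):
    """Return the first folder in search_order that contains `name`."""
    for folder in search_order:
        if name in folders[folder]:
            return folder
    return None


order = ["admin/templates", "user/templates"]

# Both blueprints ship "index.html": only one of the two files is reachable.
flat = {
    "admin/templates": {"index.html"},
    "user/templates": {"index.html"},
}
clash = find_template("index.html", order, flat)

# Namespacing the files under a per-blueprint subfolder removes the clash.
namespaced = {
    "admin/templates": {"admin/index.html"},
    "user/templates": {"user/index.html"},
}
admin_hit = find_template("admin/index.html", order, namespaced)
user_hit = find_template("user/index.html", order, namespaced)

print(clash)                # admin/templates
print(admin_hit, user_hit)  # admin/templates user/templates
```

With the flat layout, the template that wins depends entirely on the search order, which is why giving each blueprint its own subfolder is the safer arrangement.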

To avoid potential issues, you can shape the project layout in the following manner:

.
└── app/
    ├── app.py
    ├── models.py
    ├── admin/
    │   ├── __init__.py
    │   ├── routes.py
    │   ├── static/
    │   └── templates/
    │       └── admin/
    │           └── index.html
    └── user/
        ├── __init__.py
        ├── routes.py
        ├── static/
        └── templates/
            └── user/
                └── index.html

Alternatively, you can assign distinct names to the templates.

Template Routing with Blueprints

Template routing with blueprints is distinct from the conventional approach. It involves a specific format where the blueprint name is added as a prefix to the associated view function.

As an example, if your blueprint is named admin_blueprint and includes a view function named home(), then the format becomes admin_blueprint.home.

admin/templates/admin/index.html

<a href="{{ url_for('user_blueprint.home') }}">User</a>

The link mentioned above points to the route connected with the home() function in the user_blueprint. The url_for() function dynamically produces a URL for the user_blueprint.home route.

user/templates/user/index.html

<a href="{{ url_for('admin_blueprint.home') }}">Admin</a>

This link works the same way as the previous one. It leads to the route linked with the home() function in the admin_blueprint, and the url_for() function dynamically generates a URL for the admin_blueprint.home route.

Including CSS Files with Blueprints: Creating URLs for Static Assets

The procedure is quite similar to what you did with templates. To include or provide CSS files within the HTML template, you should construct a URL that directs to the CSS file situated in the specified static folder of the blueprint.

As an example, the method to link the "style.css" file found in the static directory of the "admin_blueprint" blueprint would be url_for('admin_blueprint.static', filename='style.css').

Note: You must indicate the directory path where static files (CSS, JavaScript, images, etc.) are situated for this blueprint. This is achieved using the static_folder parameter.

admin/templates/admin/index.html

<link rel="stylesheet" href="{{ url_for('admin_blueprint.static', filename='style.css') }}">

This HTML code snippet above uses the url_for() function to generate a URL that points to the "style.css" static file linked to the admin_blueprint.

Custom URL Path for Static Files

Flask Blueprint provides a static_url_path parameter that provides flexibility to define a custom URL prefix for the static files (CSS, JavaScript, images, etc.) associated with the Blueprint.

# admin/routes.py
from flask import Blueprint

admin_bp = Blueprint("admin_blueprint",
                     __name__,
                     template_folder="templates",
                     static_folder='static',
                     static_url_path='admin')

The static_url_path is established as "admin", effectively making the static files within the admin_blueprint accessible under the admin/ URL path.

Now you can directly include static files (CSS, JavaScript, images, etc) by specifying the static URL path.

Including Static Files using Static URL Path

<link rel="stylesheet" href="admin/style.css">
<img src="admin/partners.png">

The complete webpage would look like the image shown below:

Conclusion

A Flask Blueprint is used as an extension for a Flask app, and it serves the purpose of organizing large and complex applications into smaller, more manageable components.

Let's recall what you've seen in this tutorial:

What is Blueprint in Flask
Creating and Registering a Blueprint
Template routing with Blueprint
Including static files with Blueprint
Custom URL path for static assets

That's all for now

Keep Coding

How to Create and Connect an SQLite Database with Flask App using Python

Sachin Pal — Sun, 06 Aug 2023 16:34:06 GMT

This article will guide you step by step in making a database using Flask-SQLAlchemy. It will show you how to work with an SQLite database in your Flask app, and then how to make a form on the website to collect user information and put it into the database.

Installing Flask-SQLAlchemy Lib

Flask-SQLAlchemy uses SQLAlchemy, a powerful ORM (object-relational mapping) library for Python, allowing interaction with databases using Python.

Open the terminal window and install the library using pip in your project environment.

pip install Flask-SQLAlchemy

Creating SQLAlchemy Instance

Create a database.py file and add the following code inside the file to create an instance of SQLAlchemy.

""" database.py file """
from flask_sqlalchemy import SQLAlchemy

""" SQLAlchemy Instance """
db = SQLAlchemy()

This instance will be used to create the database and manage database operations.

Creating Models

SQLAlchemy provides an object-oriented approach to creating database tables and columns, defining relationships, and setting constraints using Python classes.

Models let you define a database table, its fields, their constraints, and relationships with other tables using only Python code.

""" models.py file"""
# SQLAlchemy Instance Is Imported
from database import db

# Declaring Model
class Vehicle(db.Model):
    __tablename__ = "vehicle"

    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(150), unique=True, nullable=False)
    price = db.Column(db.Integer, nullable=False)
    created_at = db.Column(db.DateTime(timezone=True), server_default=db.func.current_timestamp())

Inside the models.py file, the SQLAlchemy instance (db) is imported to facilitate interactions with the SQLite database.

A class named Vehicle is declared, which inherits from db.Model, effectively making it an SQLAlchemy model. Inside this class, the table name and structure are specified with the following details and fields:

__tablename__: This attribute sets the table name in the database, which, in this case, is "vehicle".
id: The id column serves as a primary key with an Integer data type, ensuring that each row has a unique id.
name: The name column, represented by a String type, will be used to store the vehicle's name with a maximum length of 150 characters. The unique constraint is enabled to enforce uniqueness, and nullable=False ensures that a name must be provided for each entry.
price: The price column, of Integer type, will be used to store the vehicle's price, and it cannot be left empty.
created_at: This column, with a DateTime type, captures the timestamp when an entry is created. The current timestamp is generated automatically by the database through server_default, using current_timestamp() from SQLAlchemy's func object (db.func).
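To see what these constraints mean at the database level, the sketch below creates a comparable table directly with Python's built-in sqlite3 module. The SQL is a hand-written approximation of the Vehicle model, so the exact statement SQLAlchemy emits may differ, and the row data ("Sample Car") is made up for the example.

```python
import sqlite3

# Hand-written approximation of the table the Vehicle model describes.
# The exact DDL that SQLAlchemy emits may differ.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE vehicle (
        id INTEGER PRIMARY KEY,
        name VARCHAR(150) NOT NULL UNIQUE,
        price INTEGER NOT NULL,
        created_at DATETIME DEFAULT CURRENT_TIMESTAMP
    )
    """
)

# id and created_at are filled in by the database.
conn.execute("INSERT INTO vehicle (name, price) VALUES (?, ?)", ("Sample Car", 500000))
row = conn.execute("SELECT id, name, price, created_at FROM vehicle").fetchone()

# The UNIQUE constraint rejects a second row with the same name.
try:
    conn.execute("INSERT INTO vehicle (name, price) VALUES (?, ?)", ("Sample Car", 1))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(row[:3])             # (1, 'Sample Car', 500000)
print(duplicate_rejected)  # True
```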

Setup

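Flask-SQLAlchemy can work with a variety of databases. This tutorial uses SQLite, a lightweight and simple database suitable for small-scale or demo projects. What happens when no database URI is configured depends on the version: Flask-SQLAlchemy 2.x fell back to an in-memory SQLite database with a warning, while 3.x raises an error unless SQLALCHEMY_DATABASE_URI or SQLALCHEMY_BINDS is set, so it is best to configure the URI explicitly.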

from flask import Flask
from database import db

# Creating Flask App
app = Flask(__name__)

# Database Name
db_name = 'vehicle.db'

# Configuring SQLite Database URI
app.config['SQLALCHEMY_DATABASE_URI'] = 'sqlite:///' + db_name

# Suppresses warning while tracking modifications
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False

# Initialising SQLAlchemy with Flask App
db.init_app(app)

First, the code imports the Flask class from the flask module, enabling app creation, and the SQLAlchemy instance from database, facilitating database management.

The Flask app instance is created using Flask(__name__), and the instance is stored in the variable app. The use of __name__ is to determine the root path of the module.

The database name, vehicle.db, is defined and stored in the variable db_name. The SQLALCHEMY_DATABASE_URI config variable is then set, which contains the SQLite database connection string 'sqlite:///' + db_name.

For other database engines, the connection URL formats differ. For example, for MySQL and PostgreSQL, the URL formats are provided as follows.
mysql://username:password@host:port/database_name
postgresql://username:password@host:port/database_name

The SQLALCHEMY_TRACK_MODIFICATIONS config variable is set to False, disabling modification tracking of objects and suppressing warnings.

Finally, the SQLAlchemy instance is initialized and bound to the Flask app by calling db.init_app(app).

Creating Database

There are two approaches; you can use either one to create the database and a table with the defined fields.

Approach 1 - Manually Pushing the App Context

""" app.py file """
# Previous Code Here

""" Creating Database with App Context"""
def create_db():
    with app.app_context():
        db.create_all()

if __name__ == "__main__":
    from models import Vehicle
    create_db()

The code defines a function named create_db(), which utilizes the app_context() of the Flask app instance (app) with a with statement.

In the app context, the create_all() method of the db object, provided by SQLAlchemy, is invoked to create all the tables specified in the models module.

In the if __name__ == "__main__" block, the Vehicle model is imported and create_db() function is called to generate the table.

To execute the app, either run it from your Integrated Development Environment (IDE) or use the terminal by entering the following command.

python app.py

Successful execution will result in the creation of a "vehicle.db" database within the instance directory at the root level.

Approach 2 - Flask Shell

When using the Flask shell, the commands are executed within the context of the Flask app. Below are the CLI commands to run in your terminal.

D:\SACHIN\Pycharm\FlaskxDB>flask shell
Python 3.10.5 (tags/v3.10.5:f377153, Jun  6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
App: app
Instance: D:\SACHIN\Pycharm\FlaskxDB\instance
>>> from database import db
>>> from models import Vehicle
>>> db.create_all()
>>> exit()

The SQLAlchemy instance, db and the model, Vehicle are imported from the database and models module respectively. Then, the db.create_all() command is executed to create a table based on the defined model. Finally, you can exit the shell using the exit() command.

The database vehicle.db would be created in the instance directory at the project's root directory.

Creating Frontend

To create a frontend and collect data from users via form, the following HTML files inside the templates folder need to be created.

root_dir(project)
|_templates/
|    |_base.html
|    |_home.html
|    |_vehicle.html
|
|_app.py
|_database.py
|_models.py

The layout and Bootstrap CSS and JS will be stored in base.html, the vehicle data from the database will be displayed in home.html, and a form for adding vehicle data will be stored in vehicle.html.

base.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>Vehicle</title>
    <link href="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/css/bootstrap.min.css" rel="stylesheet" integrity="sha384-1BmE4kWBq78iYhFldvKuhfTAU6auU8tT94WrHftjDbrCEXSU1oBoqyl2QvZ6jIW3" crossorigin="anonymous">
</head>
<body>
{% block body %} {% endblock %}
<script src="https://cdn.jsdelivr.net/npm/bootstrap@5.1.3/dist/js/bootstrap.bundle.min.js" integrity="sha384-ka7Sk0Gln4gmtz2MlQnikT1wXgYsOg+OMhuP+IlRH9sENBO0LRn5q+8nbTov4+1p" crossorigin="anonymous"></script>
</body>
</html>

home.html

{% extends "base.html" %}
{% block body %}
<h1 class="text-center my-5">🚗Vehicles🚗</h1>
<div class="container d-flex justify-content-center align-items-center">
    <a class="btn btn-outline-info mb-3" href="{{ url_for('add_vehicle') }}">Add Vehicles</a>
</div>
<div class="container">
    <table class="table table-dark table-striped">
        <thead>
        <tr>
            <th scope="col">ID</th>
            <th scope="col">Vehicle</th>
            <th scope="col">Price(INR)</th>
            <th scope="col">Created on</th>
        </tr>
        </thead>
        {% if not details%}
        <div class="text-center">
            <h3 class="my-5">No Records to Display!</h3>
        </div>
        {% else %}
        <tbody>
        {% for data in details %}
        <tr>
            <th scope="row">{{data.id}}</th>
            <td>{{data.name}}</td>
            <td>{{data.price}}</td>
            <td>{{data.created_at}}</td>
        </tr>
        {% endfor %}
        </tbody>
        {% endif %}
    </table>
</div>
{% endblock %}

vehicle.html

{% extends "base.html" %}
{% block body %}
<h1 class="text-center my-5">🚗Vehicle Details🚗</h1>
<div class="container">
    <a class="btn mb-3 btn-outline-info" href="{{ url_for('home') }}">Go to Home</a>
    <form action="/add-vehicle" method="POST">
        <div class="mb-3">
            <label class="form-label" for="vehicle">Vehicle</label>
            <input class="form-control" id="vehicle" name="vehicle" placeholder="Vehicle name with model and brand"
                   required type="text">
        </div>
        <div class="mb-3">
            <label class="form-label" for="price">Price</label>
            <input class="form-control" id="price" name="price" placeholder="Price of the vehicle in INR"
                    required type="text">
        </div>
        <button class="btn mt-3 btn-outline-success" type="submit">Add Vehicle</button>
    </form>
</div>
{% endblock%}

Creating Routes and Logic

""" app.py file """
# Previous Code Here

""" Creating Routes """
@app.route("/")
def home():
    details = Vehicle.query.all()
    return render_template("home.html", details=details)


@app.route("/add-vehicle", methods=['GET', 'POST'])
def add_vehicle():
    if request.method == 'POST':
        v_name = request.form.get('vehicle')
        price = request.form.get('price')
        add_detail = Vehicle(
            name=v_name,
            price=price
        )
        db.session.add(add_detail)
        db.session.commit()
        return redirect(url_for('home'))
    return render_template("vehicle.html")


if __name__ == "__main__":
    from models import Vehicle
    create_db()
    app.run(debug=True) # This line of code is added in this section

First of all, add the necessary imports in the app.py file: render_template, request, redirect, and url_for.

The provided code establishes two distinct routes: the home or "/" route and the /add-vehicle route.

The main page route (@app.route("/")) uses a function called home(). This function retrieves all vehicle information from the database using the Vehicle model and shows it on the home.html page.

The /add-vehicle route can handle both GET and POST requests. The add_vehicle() function is responsible for displaying the vehicle.html page. It also handles the process of taking user data from a submitted form, adding it to the database using db.session.add(), confirming the changes with db.session.commit(), and sending the user back to the main page through redirect(url_for('home')).
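The add_vehicle() function passes the submitted form values to the model as they arrive. A small validation step before the database call can catch empty names or non-numeric prices. The helper below is only a sketch: parse_vehicle_form is a hypothetical name, and it assumes the price is meant to be a whole number of rupees, which the article does not state. It works on any mapping with a get() method, such as request.form.

```python
def parse_vehicle_form(form):
    """Return (name, price) from a form-like mapping, or raise ValueError.

    Hypothetical helper: assumes the price is a whole number (INR).
    """
    name = (form.get("vehicle") or "").strip()
    raw_price = (form.get("price") or "").strip()

    if not name:
        raise ValueError("Vehicle name is required.")

    try:
        # Allow thousands separators such as "8,00,000".
        price = int(raw_price.replace(",", ""))
    except ValueError:
        raise ValueError("Price must be a whole number.") from None

    if price < 0:
        raise ValueError("Price cannot be negative.")
    return name, price


print(parse_vehicle_form({"vehicle": "Tata Nexon", "price": "800000"}))
# → ('Tata Nexon', 800000)
```

Inside the route, the returned values could be passed to Vehicle(name=..., price=...), and a ValueError could be turned into an error message on the form page.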

Testing

Run the Flask app directly from the IDE or as a Python module to see the app hosted on http://127.0.0.1:5000, or manually enter http://localhost:5000 in the browser's address bar.

The following image shows the first glimpse of the homepage.

To add vehicle information, use the "Add Vehicles" button or the /add-vehicle route.

After you submit the form, the vehicle information will be displayed on the homepage, to which you will be automatically redirected.

Source Code

The full source code for this article is available in the following GitHub repository. Clone or download it and run the application to try it out.

FlaskxSQLiteDB_Vehicle_App

Conclusion

SQLAlchemy is used to create an SQLite database and integrated with the Flask app to interact with the database. A simple application was created in which a form is integrated to get the data from the user and add it to the database and then display it on the homepage of the application.

This article has walked you through the steps in making the SQLite database using Flask-SQLAlchemy and integrating it with the Flask app.

🏆Other articles you might be interested in if you liked this one

How to connect the PostgreSQL database with Python?

How to create a database in Appwrite using Python?

Upload and display images on the frontend using Flask in Python.

Building a Flask image recognition webapp using a deep learning model.

How to Integrate TailwindCSS with Flask?

What are Public, Protected, and Private access modifiers in Python?

That's all for now

Keep Coding

How to Create a Database in Appwrite Using Python

Sachin Pal — Sun, 30 Jul 2023 06:58:46 GMT

Appwrite is an open-source backend platform that reduces a developer's effort and time spent building a backend server from scratch. It is a backend-as-a-service solution that handles backend tasks for web, mobile, and Flutter apps.

Appwrite offers databases, authentication, storage, real-time communication, and many other services.

Creating and configuring a database in a web app is time-consuming, and this tutorial will walk you through the process of doing so in simple steps using Appwrite's Python SDK (Software Development Kit) or package.

If you don't want to deal with preliminary tasks before setting up the database in the Appwrite cloud, you can skip to the database creation section.

Collecting Required Info

Before you begin creating a database on the Appwrite cloud, you must first obtain the following credentials from the Appwrite console:

Appwrite API key
Appwrite Project ID
Appwrite Cloud API Endpoint

Step 1: Obtaining Project ID

To gain access to the Appwrite console, create an account on the Appwrite cloud (https://cloud.appwrite.io/) or login if you already have one.
Click the "Create project" button, give your project a name, and then save the "Project ID".

Step 2: Obtaining API Key

After you've created a new project, scroll down and click the "API Key" button.
Fill in the API name and select the expiration time, then click the "Next" button, then add the scope "Database", and finally click the "Create" button.

Scroll down the Appwrite console and click on "API Keys" under the "Integrations" section, followed by your newly created API name.
Copy the project's API Key by clicking on the "API Key Secret" button.

Installing Appwrite's Python Package

Using Python to create a database on the Appwrite cloud requires a Python package called appwrite, which provides API access to interact with the Appwrite backend and perform various tasks.

Open a terminal window and type the following command to install the Python package appwrite with pip.

pip install appwrite

Creating a Database

Creating a database on the Appwrite cloud involves simple steps. With all the required credentials gathered, let's create a database using Python.

Importing Required Modules

from appwrite.client import Client
from appwrite.services.databases import Databases
from appwrite.id import ID
import os
from dotenv import load_dotenv

from appwrite.client import Client: To interact with the Appwrite API, the Client class is imported from the appwrite.client module. The Client class will allow you to configure the API key, Project ID, and API endpoint for making Appwrite backend requests.

from appwrite.services.databases import Databases: To work with Database on the Appwrite cloud, the Databases class is imported from the appwrite.services.databases module.

from appwrite.id import ID: To generate unique IDs that can be used within Appwrite, the ID class is imported from the appwrite.id module.

import os: To interact with the operating system, the os module is imported.

from dotenv import load_dotenv: To load environment variables from a .env file into the script, the load_dotenv function is imported. The library can be installed with pip install python-dotenv if it is not already installed.

Configuring the Appwrite Client for API Access

""" Configuring Appwrite Client """
# Instantiating Appwrite Client
client = Client()

# To load environment variables
load_dotenv()

# Configuring Appwrite Client
(client
 # Setting API Endpoint
 .set_endpoint('https://cloud.appwrite.io/v1')
 # Setting Project ID
 .set_project(os.getenv('PROJECT_ID'))
 # Setting API Key
 .set_key(os.getenv('API_KEY'))
 )

Instantiating Appwrite Client: The Client class instance is created and stored in the client variable.

The environment variables (PROJECT_ID and API_KEY) are loaded into the script using the load_dotenv() function.

Setting API Endpoint: The Appwrite client's set_endpoint() method is used to set the API endpoint to the URL 'https://cloud.appwrite.io/v1'.

Setting Project ID: The Appwrite client's set_project() method is used to set the Project ID of the project by retrieving it from the environment variable 'PROJECT_ID' using os.getenv('PROJECT_ID').

Setting API Key: Similarly, the API key is set with the set_key() method, and it was retrieved from the environment variable 'API_KEY' with os.getenv('API_KEY').
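Because os.getenv() returns None when a variable is missing, a typo in the .env file would only surface later as a failed API call. One way to fail early is a small wrapper like the sketch below. The name require_env is hypothetical and not part of Appwrite or python-dotenv, and the DEMO_PROJECT_ID variable exists only for the demonstration.

```python
import os


def require_env(name):
    """Return the value of an environment variable or raise a clear error.

    Hypothetical helper, not part of the Appwrite SDK or python-dotenv.
    """
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set.")
    return value


# Demonstration with a variable set inside the script itself.
os.environ["DEMO_PROJECT_ID"] = "demo-project"
print(require_env("DEMO_PROJECT_ID"))  # → demo-project
```

In the configuration above, os.getenv('PROJECT_ID') and os.getenv('API_KEY') could be swapped for require_env('PROJECT_ID') and require_env('API_KEY').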

Creating a New Database

""" Creating Database """
# Initializing databases service
databases = Databases(client)

# To generate unique database ID
db_id = ID.unique()

# Creating a new database
create_db = databases.create(db_id, 'BooksDB')
print("Database Successfully Created.")

The Databases instance is created with Databases(client) and saved in the databases variable. This enables interaction with the Appwrite Database API and the execution of various database-related tasks.

The ID.unique() method is used to generate a unique ID for the database, which is then stored in the db_id variable.

The code then creates a database by calling the databases.create() method, which takes two parameters: the database ID (db_id in this case) and the database name ('BooksDB' in this case).

If you run the file, the database will be created and will be visible on the Appwrite cloud.

Creating a database alone is not sufficient, especially if you intend to connect it to a web app. Before it can support CRUD operations such as adding, reading, updating, and deleting data, it needs a structure to hold that data.

To create a fully functional database for data storage, the following must be created:

Collections
Attributes
Documents

Creating Collections

Collections in the Appwrite database are data storage containers similar to tables in traditional databases.

You can create multiple collections in a single Appwrite database to store and manage data from various sources, which will aid in data management.

"""Creating Collections"""
# Database ID
database_id = create_db['$id']

# For Generating Unique Collection ID
collection_id = ID.unique()

# Creating a New Collection
new_collection = databases.create_collection(database_id=database_id,
                                             collection_id=collection_id,
                                             name='Books')
print('Collection Successfully Created.')

The database ID is retrieved using create_db['$id'] and stored in the database_id variable.

The ID.unique() method is used to generate a unique ID for the collection in the database and the resulting ID is stored in the collection_id variable.

To create a new collection, the databases.create_collection() method is used. It accepts three required arguments: database_id, the ID of the database where the collection will be created; collection_id, a unique ID that avoids conflicts with existing collections; and name, the name of the new collection ("Books" in this case).

Creating Attributes

With the collection in place, the next step is to create attributes. Attributes define the schema of the data and come in different types, so you can choose the ones that fit your requirements.

Attributes are similar to fields in a traditional database table, where data is stored under the respective field. This ensures a standardized structure of the documents in the Appwrite database.

Since the database is for Book details, the schema will be as follows:

id - integer: Used to store the book's ID.
image - url: Book image
title - string: The title of the book
author - string: The author of the book
genre - string: The book's genre

The attributes will be generated based on the fields listed above.

"""Creating Attributes"""
# Collection ID of Book
c_id = new_collection['$id']

""" Creating integer attribute """
# ID Attribute
book_id = databases.create_integer_attribute(database_id=database_id,
                                             collection_id=c_id,
                                             key="id",
                                             required=True)

""" Creating url attribute """
# URL Attribute
book_url = databases.create_url_attribute(database_id=database_id,
                                          collection_id=c_id,
                                          key="image",
                                          required=True)

""" Creating string attribute """
# Title Attribute
book_title = databases.create_string_attribute(database_id=database_id,
                                               collection_id=c_id,
                                               key="title",
                                               required=True,
                                               size=100)

# Author Attribute
book_author = databases.create_string_attribute(database_id=database_id,
                                                collection_id=c_id,
                                                key="author",
                                                required=True,
                                                size=50)

# Genre Attribute
book_genre = databases.create_string_attribute(database_id=database_id,
                                               collection_id=c_id,
                                               key="genre",
                                               required=True,
                                               size=50)
print("Attributes Successfully Created.")

The create_integer_attribute() method is used to create the integer attribute "id" for the "Books" collection. This method is invoked with four mandatory arguments: database_id (set to database_id), collection_id (set to c_id), key (set to "id" as the attribute name), and required (set to True to indicate that the value of this attribute cannot be left empty).

The create_url_attribute() method is used to create the URL attribute "image" for the "Books" collection. This method is also called with four mandatory arguments: database_id, collection_id, key set to "image", and required parameter set to True.

Finally, the create_string_attribute() method is used to create three string attributes, namely "title", "author", and "genre" for the "Books" collection. This method requires five parameters: database_id, collection_id, key, required, and size, which is the maximum number of characters allowed for the attribute.
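Since every attribute is required and the string attributes have size limits, it can help to check a record locally before sending it to Appwrite. The sketch below mirrors the schema above (integer id, URL image, and title, author, and genre limited to 100, 50, and 50 characters). The function validate_book is a hypothetical helper, and its URL check is a simple approximation that may not match Appwrite's own validation rules.

```python
from urllib.parse import urlparse

# Size limits taken from the string attributes created above.
MAX_LENGTHS = {"title": 100, "author": 50, "genre": 50}


def validate_book(book):
    """Return a list of problems found in a book dictionary (empty if none).

    Hypothetical local pre-check; Appwrite performs its own validation.
    """
    problems = []

    book_id = book.get("id")
    if not isinstance(book_id, int) or isinstance(book_id, bool):
        problems.append("id must be an integer")

    parsed = urlparse(str(book.get("image", "")))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("image must be an http(s) URL")

    for key, limit in MAX_LENGTHS.items():
        value = book.get(key)
        if not isinstance(value, str) or not value:
            problems.append(f"{key} must be a non-empty string")
        elif len(value) > limit:
            problems.append(f"{key} exceeds {limit} characters")
    return problems


print(validate_book({"id": "1", "image": "not-a-url", "title": "Dune",
                     "author": "Frank Herbert", "genre": "Fiction"}))
# → ['id must be an integer', 'image must be an http(s) URL']
```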

Now that all processes have been completed and the database is ready, you can add data from the frontend and store it as documents within the database.

Adding Documents

Following the schema created in the previous section, data will now be added programmatically to the "Books" collection in the "BooksDB" database.

""" Adding Documents """
# Unique Identifier for Document ID
document_id = ID.unique()

""" Function for Adding Documents(data) in the Database """
def add_doc(document):
    try:
        doc = databases.create_document(
            database_id=database_id,
            collection_id=c_id,
            document_id=document_id,
            data=document
        )
        print("Id:", doc['id'])
        print("Image:", doc['image'])
        print("Title:", doc['title'])
        print("Author:", doc['author'])
        print("Genre:", doc['genre'])
        print("-" * 20)
    except Exception as e:
        print(e)

The code defines an add_doc function that takes a single argument named document. Within the function's try block, the databases.create_document() method is invoked to generate a document, which is subsequently stored in the doc variable.

This method requires four mandatory parameters: database_id (set as database_id), collection_id (set as c_id), document_id (representing a unique ID for the document, specifically assigned as document_id), and data (representing the document's content formatted in a dictionary).

The doc variable stores a dictionary returned by the databases.create_document(). Utilizing this doc variable, the code retrieves the id, image, title, author, and genre of the added document and prints them.

If an exception arises during the try block's execution, the code captures the error and displays an error message.

Data

# Data To Be Added
book_1 = {
    "id": 1,
    "image": "https://i.pinimg.com/474x/dc/17/2d/dc172d6fa3f5461d94e6d384aded2cb4.jpg",
    "title": "The Great Gatsby",
    "author": "F. Scott Fitzgerald",
    "genre": "Fiction"
}

book_2 = {
    "id": 2,
    "image": "https://i.pinimg.com/originals/0b/bf/b5/0bbfb59b4d5592e2e7fac9930012ce6d.jpg",
    "title": "To Kill a Mockingbird",
    "author": "Harper Lee",
    "genre": "Fiction"
}

book_3 = {
    "id": 3,
    "image": "https://i.pinimg.com/736x/66/1d/17/661d179ab722e67eed274d24b8965b0d.jpg",
    "title": "Pride and Prejudice",
    "author": "Jane Austen",
    "genre": "Romance"
}

book_4 = {
    "id": 4,
    "image": "https://i.pinimg.com/originals/68/c5/4c/68c54c9599ba37d9ab98c0c51afe2298.png",
    "title": "Crime and Punishment",
    "author": "Fyodor Dostoevsky",
    "genre": "Psychological Fiction"
}

# Calling function with the data to be added
add_doc(book_1)
add_doc(book_2)
add_doc(book_3)
add_doc(book_4)
print("Documents Successfully Added.")

When you run the whole script, the database, collection, and attributes are created, the documents are added, and the following output is printed.

Database Successfully Created.
Collection Successfully Created.
Attributes Successfully Created.
Id: 1
Image: https://i.pinimg.com/474x/dc/17/2d/dc172d6fa3f5461d94e6d384aded2cb4.jpg
Title: The Great Gatsby
Author: F. Scott Fitzgerald
Genre: Fiction
--------------------
Id: 2
Image: https://i.pinimg.com/originals/0b/bf/b5/0bbfb59b4d5592e2e7fac9930012ce6d.jpg
Title: To Kill a Mockingbird
Author: Harper Lee
Genre: Fiction
--------------------
Id: 3
Image: https://i.pinimg.com/736x/66/1d/17/661d179ab722e67eed274d24b8965b0d.jpg
Title: Pride and Prejudice
Author: Jane Austen
Genre: Romance
--------------------
Id: 4
Image: https://i.pinimg.com/originals/68/c5/4c/68c54c9599ba37d9ab98c0c51afe2298.png
Title: Crime and Punishment
Author: Fyodor Dostoevsky
Genre: Psychological Fiction
--------------------
Documents Successfully Added.

Source Code

The full source code is available in the following GitHub repository. Clone or download the code and run it in your favorite IDE.

Python Script For Creating Appwrite Database

Conclusion

The tutorial walked you through the steps of setting up a new database in the Appwrite cloud. It also includes instructions for creating a new project, creating an API key for the project, and obtaining the project ID and API key from the Appwrite cloud.

Following the creation of the database, the tutorial takes you through the steps of making it fully functional by adding collections and attributes. The documents (data) are then added programmatically.

Let's go over the steps in this tutorial for creating a new database:

Obtaining the necessary Appwrite cloud credentials
Installing the Python package appwrite
Making a database
Making a collection
Adding the attributes
Adding the documents programmatically

🏆Other articles you might be interested in if you liked this one

How to connect the PostgreSQL database with Python?

Upload and display images on the frontend using Flask in Python.

Building a Flask image recognition webapp using a deep learning model.

Building a custom deep learning model using transfer learning.

How to augment the data for training using Keras and Python?

How to build a CLI command in a few steps using argparse in Python?

How to use Async/Await like JavaScript in Python?

How to scrape a webpage's content using BeautifulSoup in Python?

That's all for now

Keep Coding

Comparing Files and Directories Using filecmp Module in Python

Sachin Pal — Mon, 24 Jul 2023 12:55:41 GMT

The filecmp module, part of Python's standard library, provides functions for comparing files and directories programmatically.

Comparing Files

The filecmp module includes a function called cmp() that compares two files and returns True if they are equal, False otherwise.

Syntax

filecmp.cmp(f1, f2, shallow=True)

Parameters -

f1: First filename

f2: Second filename

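shallow: If True (the default) and the os.stat() signatures (file type, size, and modification time) of both files are identical, the files are considered equal without reading them. Otherwise, the files are treated as different if their sizes or contents differ.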
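The effect of shallow is easiest to see with two files that have the same size and modification time but different contents. The sketch below builds such a pair in a temporary directory. The helper name compare_both_ways and the fixed timestamp are chosen only for this illustration.

```python
import filecmp
import os
import tempfile


def compare_both_ways(content_a, content_b):
    """Compare two temporary files with shallow=True and shallow=False."""
    with tempfile.TemporaryDirectory() as tmp:
        path_a = os.path.join(tmp, "a.txt")
        path_b = os.path.join(tmp, "b.txt")
        for path, content in ((path_a, content_a), (path_b, content_b)):
            with open(path, "wb") as fh:
                fh.write(content)
            # Force identical access/modification times on both files.
            os.utime(path, (1_700_000_000, 1_700_000_000))

        shallow_result = filecmp.cmp(path_a, path_b, shallow=True)
        deep_result = filecmp.cmp(path_a, path_b, shallow=False)
    return shallow_result, deep_result


# Same size and timestamp, different bytes.
print(compare_both_ways(b"hello world", b"HELLO WORLD"))  # → (True, False)
```

With shallow=True the matching signatures are enough to report the files as equal, while shallow=False reads the contents and detects the difference.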

Comparing Files Using cmp()

import filecmp

compare = filecmp.cmp('test_file1.txt', 'test_file2.txt')
print(compare)

----------
True

Both files (test_file1.txt and test_file2.txt) have the same content, size, and permissions, which is why the above code returned True.

If you inspect both files with the os.stat() function, most of the reported information is the same.

import os

stat1 = os.stat('test_file1.txt')
print("Information: test_file1.txt")
print(stat1)

stat2 = os.stat('test_file2.txt')
print("Information: test_file2.txt")
print(stat2)

Some of the attributes reported by os.stat() are the same for both files.

Information: test_file1.txt
os.stat_result(st_mode=33206, st_ino=6473924465395070, st_dev=3836766283, st_nlink=1, st_uid=0, st_gid=0, st_size=20, st_atime=1689869596, st_mtime=1689856217, st_ctime=1689856083)
Information: test_file2.txt
os.stat_result(st_mode=33206, st_ino=2814749768156544, st_dev=3836766283, st_nlink=1, st_uid=0, st_gid=0, st_size=20, st_atime=1689869596, st_mtime=1689856277, st_ctime=1689856094)

The output shows that both files match in st_mode (file type and permissions) and st_size (file size), while their modification times (st_mtime) differ.
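The three pieces of information that the filecmp documentation describes as the os.stat() signature (file type, size, and modification time) can be pulled out of a stat result by hand. The sketch below does this for a temporary file. stat_signature is a hypothetical helper written for this illustration, not a public filecmp function.

```python
import os
import stat
import tempfile


def stat_signature(path):
    """Return (file type, size, modification time) for a path."""
    st = os.stat(path)
    return stat.S_IFMT(st.st_mode), st.st_size, st.st_mtime


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.txt")
    with open(path, "w") as fh:
        fh.write("Hi, I am a test file.")

    file_type, size, mtime = stat_signature(path)
    # stat.S_IFREG marks a regular file.
    print(file_type == stat.S_IFREG, size)  # → True 21
```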

Comparing Files Having Different Info

import filecmp

file_path1 = 'test_file1.txt'
file_path2 = 'D:/SACHIN/Pycharm/file_handling/test.txt'

compare = filecmp.cmp(file_path1, file_path2, shallow=True)
print(compare)

----------
False

The above code returned False because the contents of both files differ, as does the file size.

Comparing Files From Different Directories

Files from two different directories can be compared using the filecmp.cmpfiles() function.

The function compares the common files in the directories specified and returns three results.

match: A list of filenames that are shared by both directories and have the same content.
mismatch: A list of filenames that are shared by both directories but contain different content.
errors: A list of filenames that were unable to be compared.

Syntax

filecmp.cmpfiles(dir1, dir2, common, shallow=True)

Parameters -

dir1: First directory path

dir2: Second directory path

common: A list of filenames to compare, each expected to exist in both dir1 and dir2

shallow: Has the same meaning and default as in filecmp.cmp(). If True and the os.stat() signatures of two files are identical, the files are considered equal.

For this section, consider two directories called first_dir and second_dir with the following files:

first_dir/
    demo.txt
    sample.txt
    test.txt
second_dir/
    basic.txt
    demo.txt
    sample.txt
    test.txt

Example

import filecmp

file_dir1 = 'first_dir'
file_dir2 = 'second_dir'

common_files = ['basic.txt', 'demo.txt', 'sample.txt', 'test.txt']

matched, mismatch, not_compared = filecmp.cmpfiles(file_dir1,
                                                    file_dir2,
                                                    common=common_files)

print(f"Matched: {matched}")
print(f"Unmatched: {mismatch}")
print(f"Unable to Compare: {not_compared}")

The paths to both directories were specified in the above code, and the list of filenames to be compared was saved in the variable common_files.

The filecmp.cmpfiles() function was then called, and the directories and list of filenames were passed inside the function and assigned to three variables: matched, mismatch, and not_compared. The results were then printed.

Matched: ['sample.txt', 'test.txt']
Unmatched: ['demo.txt']
Unable to Compare: ['basic.txt']

The filenames sample.txt and test.txt matched because they have the same content and are found in both directories. The demo.txt file does not match due to different content, and the basic.txt file cannot be compared because one of the directories lacks the basic.txt file to compare with.
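To try this without preparing directories by hand, the following sketch creates two temporary directories and runs filecmp.cmpfiles() on them. The file contents and the helper names write_files and compare_sample_dirs are made up for this illustration. shallow=False is passed so that the result depends on file contents rather than on timestamps.

```python
import filecmp
import os
import tempfile


def write_files(directory, files):
    """Create `directory` and write each name/text pair into it."""
    os.makedirs(directory, exist_ok=True)
    for name, text in files.items():
        with open(os.path.join(directory, name), "w") as fh:
            fh.write(text)


def compare_sample_dirs():
    """Build two small directories and compare three filenames."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "first_dir")
        second = os.path.join(tmp, "second_dir")
        write_files(first, {"demo.txt": "left version",
                            "sample.txt": "same text"})
        write_files(second, {"demo.txt": "right version, longer",
                             "sample.txt": "same text",
                             "basic.txt": "only in second_dir"})
        return filecmp.cmpfiles(first, second,
                                ["basic.txt", "demo.txt", "sample.txt"],
                                shallow=False)


match, mismatch, errors = compare_sample_dirs()
print(match, mismatch, errors)
# → ['sample.txt'] ['demo.txt'] ['basic.txt']
```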

dircmp - Perform Directory Comparisons on Various Factors

Calling filecmp.dircmp() with the paths of the two directories to compare creates a dircmp object. The dircmp class provides methods and attributes for comparing the directories, reporting their differences, and working with their subdirectories.

Syntax

filecmp.dircmp(a, b, ignore=None, hide=None)

Parameters -

a: First directory path
b: Second directory path
ignore: A list of filenames to ignore during the comparison (defaults to filecmp.DEFAULT_IGNORES).
hide: A list of filenames to hide (defaults to [os.curdir, os.pardir]).

Creating a dircmp Object

import filecmp

file_dir1 = 'first_dir'
file_dir2 = 'second_dir'

dircmp_obj = filecmp.dircmp(file_dir1, file_dir2)
print(dircmp_obj)

----------
<filecmp.dircmp object at 0x000001FE7ECF5A80>

The dircmp object is created by invoking filecmp.dircmp() with the paths to the directories to be compared (file_dir1 and file_dir2). By calling the methods and attributes on dircmp_obj, the directories can now be compared on various criteria.

Generating Comparison Report

The report() method generates a report comparing the specified directories.

dircmp_obj.report()

----------
diff first_dir second_dir
Only in second_dir : ['basic.txt']
Identical files : ['sample.txt', 'test.txt']
Differing files : ['demo.txt']

Calling report() on dircmp_obj compared the two directories, revealing that sample.txt and test.txt files were identical, the basic.txt file was only found in the second_dir directory, and demo.txt files were found in both directories but their contents differ.

Identifying Missing Files

The left_only and right_only attributes can be used to display filenames that are only found in the left (a) or right (b) directories. In simple words, you can find which file is present in one directory but missing in another directory.

# Displaying filenames that are only present in left_dir
filenames_only_in_left_dir = dircmp_obj.left_only
print(f"Filenames Only in Left Directory: {filenames_only_in_left_dir}")

# Displaying filenames that are only present in right_dir
filenames_only_in_right_dir = dircmp_obj.right_only
print(f"Filenames Only in Right Directory: {filenames_only_in_right_dir}")

----------
Filenames Only in Left Directory: []
Filenames Only in Right Directory: ['basic.txt']

The output above shows that the basic.txt file is missing in the left directory (first_dir), but it exists in the right directory (second_dir).

Listing Filenames

The left_list and right_list attributes list the filenames present in the left and right directories.

# Listing filenames in left_dir
filenames_in_left_dir = dircmp_obj.left_list
print(f"Filenames in Left Directory: {filenames_in_left_dir}")

# Listing filenames in right_dir
filenames_in_right_dir = dircmp_obj.right_list
print(f"Filenames in Right Directory: {filenames_in_right_dir}")

Output

Filenames in Left Directory: ['demo.txt', 'sample.txt', 'test.txt']
Filenames in Right Directory: ['basic.txt', 'demo.txt', 'sample.txt', 'test.txt']

Similarly, the left and right attributes can be used to show the path of the left and right directories.

left_dir_path = dircmp_obj.left
print(f"Path of Left Directory: {left_dir_path}")

right_dir_path = dircmp_obj.right
print(f"Path of Right Directory: {right_dir_path}")

----------
Path of Left Directory: first_dir
Path of Right Directory: second_dir

Analyzing Files

# Displaying common files and subdirectories
common_files_dir = dircmp_obj.common
print(f"Common Files and Subdirectories: {common_files_dir}")

# Displaying common files
common_files = dircmp_obj.common_files
print(f"Common Files: {common_files}")

# Displaying common directories
common_directories = dircmp_obj.common_dirs
print(f"Common Directories: {common_directories}")

# Displaying same files
same_files = dircmp_obj.same_files
print(f"Same Files: {same_files}")

# Displaying differ files
differ_files = dircmp_obj.diff_files
print(f"Unmatched Files: {differ_files}")

Output

Common Files and Subdirectories: ['demo.txt', 'sample.txt', 'test.txt']
Common Files: ['demo.txt', 'sample.txt', 'test.txt']
Common Directories: []
Same Files: ['sample.txt', 'test.txt']
Unmatched Files: ['demo.txt']

By examining the output:

common returns a list of files and subdirectories that are shared by both directories.
common_files returns the list of files that are shared by both directories.
common_dirs returns a list of directories that are shared by both directories.
same_files returns a list of filenames that can be found in both directories and have the same content.
diff_files returns a list of filenames that exist in both directories but have different contents.
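The example directories above contain no subdirectories, so common_dirs is empty. The sketch below builds a temporary pair of directories that share a docs subdirectory and collects several dircmp attributes into a dictionary. The directory contents and the helper names write_tree, summarize, and demo are invented for this illustration.

```python
import filecmp
import os
import tempfile


def write_tree(root, files):
    """Create each relative path in `files` under `root` with the given text."""
    for relative_path, text in files.items():
        full_path = os.path.join(root, relative_path)
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w") as fh:
            fh.write(text)


def summarize(left, right):
    """Collect a few dircmp attributes into a plain dictionary."""
    result = filecmp.dircmp(left, right)
    return {
        "left_only": sorted(result.left_only),
        "right_only": sorted(result.right_only),
        "common": sorted(result.common),
        "common_dirs": sorted(result.common_dirs),
        "same_files": sorted(result.same_files),
        "diff_files": sorted(result.diff_files),
    }


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        left = os.path.join(tmp, "first_dir")
        right = os.path.join(tmp, "second_dir")
        write_tree(left, {"demo.txt": "short",
                          "sample.txt": "same text",
                          "docs/a.txt": "x"})
        write_tree(right, {"demo.txt": "a much longer version",
                           "sample.txt": "same text",
                           "basic.txt": "only here",
                           "docs/a.txt": "x"})
        return summarize(left, right)


for key, value in demo().items():
    print(f"{key}: {value}")
```

Here common should contain both files and the docs directory, while common_dirs should contain only docs.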

Ignoring and Hiding Comparison of Files

If you want to exclude certain files from the comparison, filecmp.dircmp accepts two parameters: ignore (a list of filenames to ignore) and hide (a list of filenames to hide).

import filecmp

file_dir1 = 'first_dir'
file_dir2 = 'second_dir'

# Filename to ignore
ignore = ['demo.txt']
# Filename to hide
hide = ['basic.txt']

# Creating dircmp object
dircmp_obj = filecmp.dircmp(file_dir1, file_dir2, ignore=ignore, hide=hide)

# Generating comparison report
dircmp_obj.report()

# Listing the filenames in left directory
filenames_in_left_dir = dircmp_obj.left_list
print(f"Filenames in Left Directory: {filenames_in_left_dir}")

# Listing the filenames in right directory
filenames_in_right_dir = dircmp_obj.right_list
print(f"Filenames in Right Directory: {filenames_in_right_dir}")

Output

diff first_dir second_dir
Identical files : ['sample.txt', 'test.txt']
Filenames in Left Directory: ['sample.txt', 'test.txt']
Filenames in Right Directory: ['sample.txt', 'test.txt']

Both directories' demo.txt files were ignored, and the basic.txt file was hidden from comparison.
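One detail worth knowing is that passing ignore replaces the default list, filecmp.DEFAULT_IGNORES, instead of adding to it. If the defaults (entries such as '.git' and '__pycache__') should still be skipped, they can be combined with your own names, as in the sketch below. The helper diff_ignoring and the temporary files are assumptions made for this illustration.

```python
import filecmp
import os
import tempfile


def diff_ignoring(left, right, extra_ignores):
    """Return differing files while keeping the default ignore list."""
    ignore = filecmp.DEFAULT_IGNORES + list(extra_ignores)
    return sorted(filecmp.dircmp(left, right, ignore=ignore).diff_files)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        left = os.path.join(tmp, "first_dir")
        right = os.path.join(tmp, "second_dir")
        for directory, suffix in ((left, ""), (right, " plus extra text")):
            os.makedirs(directory)
            # Both files differ between the two directories.
            for name in ("demo.txt", "notes.txt"):
                with open(os.path.join(directory, name), "w") as fh:
                    fh.write(name + suffix)
        return diff_ignoring(left, right, []), diff_ignoring(left, right, ["demo.txt"])


print(demo())  # → (['demo.txt', 'notes.txt'], ['notes.txt'])
```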

Clearing Cache

The filecmp module includes a function called clear_cache() that allows you to clear the internal cache used by the filecmp module.

The filecmp module caches comparison results, keyed by the file paths and their os.stat() signatures. If a file is modified and compared again so quickly that the change falls within the modification-time resolution of the underlying filesystem, the signature may not change, and a stale cached result may report the files as identical.

If you get odd results while comparing files, calling filecmp.clear_cache() to clear the cache is worth a try.

Consider the following example, in which two image files are compared, the cache populated by the comparison is printed, and the cache is then cleared with the filecmp.clear_cache() function.

import filecmp

file_dir1 = 'D:/SACHIN/Desktop/rise.png'
file_dir2 = 'D:/SACHIN/Desktop/media/rise.png'

# Comparing image file
compare = filecmp.cmp(file_dir1, file_dir2, shallow=False)
print(compare)

# Printing the cache stored by filecmp
print(filecmp._cache)

# Clearing cache
filecmp.clear_cache()
print(filecmp._cache)

# Checking if cache is cleared or not
assert len(filecmp._cache) == 0, 'Cache not cleared'

The assert statement at the end of the code snippet checks that the cache is cleared (the module's internal variable _cache is empty). If it is not, an AssertionError with the message 'Cache not cleared' is raised.

True
{('D:/SACHIN/Desktop/rise.png', 'D:/SACHIN/Desktop/media/rise.png', (32768, 6516, 1689779926.7445374), (32768, 6516, 1689779926.7445374)): True}
{}

Conclusion

The filecmp module provides functions such as cmp() and cmpfiles() for comparing various types of files and directories, and the dircmp class provides numerous methods and attributes for comparing the files and directories on various factors.

Let's recall what you've learned:

Comparing two different files using the cmp() function.
Comparing files from two different directories using the cmpfiles() function.
Using the dircmp class and its methods and attributes to summarise, analyze, and generate reports on files and directories.
Clearing the internal cache stored by the filecmp module using the filecmp.clear_cache() function.

🏆Other articles you might be interested in if you liked this one

How to read multiple files simultaneously using fileinput module in Python?

Generate temporary files and directories using tempfile module in Python.

assert statement - Debug your code using assert statements in Python.

Understanding the different uses of asterisk(*) in Python.

What is the difference between seek() and tell() in Python?

How to use match-case statements for pattern matching in Python?

__init__ vs __new__ methods in Python.

How to manipulate paths using the pathlib module in Python?

That's all for now

Keep Coding

How to Read Multiple Files Simultaneously With fileinput Module In Python

Sachin Pal — Thu, 20 Jul 2023 15:21:59 GMT

The fileinput module is part of the standard library and is used to iterate over the contents of multiple files in a single loop, as if they were one stream. Python's built-in open() function can also be used to iterate over file content, but only for one file at a time.

You'll explore the classes and functions provided by the fileinput module to iterate over multiple files.

You can also use fileinput to iterate over a single file, but the open() function is a better fit for that.

Basic Usage

import fileinput
# Creating fileinput instance and passing multiple files
stream = fileinput.input(files=('test.txt',
                                'sample.txt',
                                'index.html'))
# Iterating the content
for data in stream:
    print(data)

The fileinput module was first imported, and then the fileinput instance was created by calling fileinput.input() and passing the tuple of files (test.txt, sample.txt, and index.html). This will result in the return of an iterator.

The contents of the files were then iterated and printed using the for loop.

Hi, I am a test file.
Hi, I am a sample file for testing.
"en">
    Test HTML File
Hi, I am a simple HTML File.

Another approach would be to use fileinput.input() as a context manager. This method is safer because it ensures that the fileinput instance is closed even if an exception occurs.

import fileinput
with fileinput.input(files=('test.txt', 'sample.txt')) as files:
    for data in files:
        print(data)

In the above demonstration, the fileinput module was used as a context manager with the 'with' statement.

The call returns an iterator, which the as clause assigns to the files variable; the lines are then iterated through that variable.

Hi, I am a test file.
Hi, I am a sample file for testing.
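To try the context-manager form without preparing files by hand, here is a minimal sketch that writes two temporary files first. The helper name read_all_lines is an assumption made for illustration; the file names and contents mirror the ones used above. Each line yielded by fileinput keeps its trailing newline, so the helper strips it.

```python
import fileinput
import tempfile
from pathlib import Path


def read_all_lines(paths):
    """Return every line from the given files, in order, without trailing newlines."""
    lines = []
    with fileinput.input(files=paths) as stream:
        for line in stream:
            lines.append(line.rstrip("\n"))
    return lines


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "test.txt"
    second = Path(tmp) / "sample.txt"
    first.write_text("Hi, I am a test file.\n")
    second.write_text("Hi, I am a sample file for testing.\n")
    collected = read_all_lines((str(first), str(second)))

print(collected)
```

The files are read one after another, in the order in which they were passed.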

The fileinput.input() Function

The fileinput.input() function is the primary interface of the fileinput module, and it covers most of what the module is used for. You saw a glimpse of the fileinput.input() function in the previous section; this time, you'll learn more about it.

Syntax

fileinput.input(files=None, inplace=False, backup='', mode='r', openhook=None, encoding=None, errors=None)

Parameters:

files: Defaults to None. Takes a single file or multiple files to be processed.

inplace: Defaults to False. When set to True, the files can be modified directly.

backup: Defaults to an empty string. The extension is specified for the backup files when inplace is set to True.

mode: Defaults to 'r'. Files can only be opened for reading, so the accepted values are 'r' and 'rb' (the older 'rU' and 'U' modes were removed in Python 3.11).

openhook: Defaults to None. A custom function for controlling how files are opened.

encoding: Defaults to None. Specifies the encoding to be used to read the files.

errors: Defaults to None. Specifies how the errors should be handled.

Modifying the Files Before Reading

import fileinput
with fileinput.input(files=('test.txt', 'sample.txt'), inplace=True) as files:
    for data in files:
        modified_content = data.lower()
        # Each line already ends with a newline, so suppress print's own
        print(modified_content, end='')

The inplace parameter is set to True in the above code, so while each file is being read, anything written to standard output (for example with print()) is written back into that file instead of the console.

The code above will lowercase the content of both files (test.txt and sample.txt).

Storing Backup of Files

When the inplace parameter is set to True, the original files can be edited, but the original state of the files can be saved in another file using the backup parameter.

import fileinput
with fileinput.input(files=('test.txt', 'sample.txt'),
                     inplace=True, backup='.bak') as files:
    for data in files:
        modified_content = data.capitalize()
        # Each line already ends with a newline, so suppress print's own
        print(modified_content, end='')

The above code will capitalize the content and the original files will be saved as test.txt.bak and sample.txt.bak due to the backup='.bak'.
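The in-place editing and backup behaviour can be checked end to end with a temporary file. This is a sketch under stated assumptions: the helper lowercase_in_place and the sample text are made up for illustration, while inplace and backup are the fileinput.input() parameters described above.

```python
import fileinput
import tempfile
from pathlib import Path


def lowercase_in_place(path, backup_ext=".bak"):
    """Rewrite the file at `path` in lowercase, keeping the original as a backup."""
    with fileinput.input(files=(str(path),), inplace=True, backup=backup_ext) as stream:
        for line in stream:
            # stdout is redirected into the file; end="" avoids doubling newlines
            print(line.lower(), end="")


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "test.txt"
    target.write_text("Hi, I Am A TEST File.\n")
    lowercase_in_place(target)
    rewritten = target.read_text()
    backup_text = Path(str(target) + ".bak").read_text()

print(repr(rewritten))
print(repr(backup_text))
```

The rewritten file holds the lowercased text, while the .bak file still holds the original content.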

Controlling the Opening of the File

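import fileinput
def custom_open(filename, mode):
    # Append a note to the file; the with block closes it so the write is flushed
    with open(filename, "a+") as data:
        data.write(" Data added through function.")
    # Reopen the file in the mode requested by fileinput
    return open(filename, mode)
with fileinput.input(files=("test.txt", "sample.txt"), openhook=custom_open) as file:
    for data in file:
        print(data)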

The custom_open() function takes two parameters, filename and mode. It opens the file in append + read mode, writes a string to it, and then returns the file reopened in the requested mode.

From the Python documentation: "The hook must be a function that takes two arguments, filename and mode, and returns an accordingly opened file-like object."

The files are then passed to the fileinput.input() function, and the openhook parameter is set to custom_open. The custom_open() function will be in charge of opening the files. The file content was iterated and printed.

Hi, i am a test file. Data added through function.
Hi, i am a sample file for testing. Data added through function.

Reading Unicode Characters

Suppose you have a file containing Unicode characters and need to read it. To decode the characters correctly, a suitable encoding must be specified.

import fileinput
with fileinput.input(files=('test_unicode.txt',), encoding='utf-8') as files:
    for data in files:
        print(data)

The UTF-8 encoding can be used to read the Unicode characters, hence, the encoding parameter is set to utf-8 encoding.

😁😂😅

Handling Errors

To control how decoding errors are handled, use the errors parameter. Take the above code as an example: if the encoding was not specified and the platform's default encoding could not decode the file, the code would raise a UnicodeDecodeError.

import fileinput
with fileinput.input(files=('test_bin.txt',), errors='ignore') as files:
    for data in files:
        print(data)

The errors parameter is set to ignore, which means that data that cannot be decoded is skipped. The errors parameter can also be set to strict (raise an exception if an error occurs) or replace (substitute a replacement marker for the data that cannot be decoded).

Functions to Access Input File Information

There are some functions that can be used to access the information of the input files which are being processed using the fileinput.input() function.

Getting the File Names

Using the fileinput.filename() function, the name of the file currently being processed can be displayed.

with fileinput.input(files=('test.txt', 'sample.txt')) as files:
    for data in files:
        print(f"File: {fileinput.filename()}")
        print(data)

Output

File: test.txt
Hi, i am a test file. Data added through function. Added data to the file.
File: sample.txt
Hi, i am a sample file for testing. Data added through function. Added data to the file.

Getting the File Descriptor and Line and File Line Number

The fileinput.fileno() function returns the active file's file descriptor, the fileinput.lineno() function returns the cumulative line number, and the fileinput.filelineno() function returns the line number of the currently processed file.

with fileinput.input(files=('test.txt', 'sample.txt')) as files:
    for data in files:
        print(f"{fileinput.filename()}'s File Descriptor: {fileinput.fileno()}")
        print(f"{fileinput.filename()}'s File Line Number: {fileinput.filelineno()}")
        print(f"{fileinput.filename()}'s File Cumulative Line No.: {fileinput.lineno()}")

Output

test.txt's File Descriptor: 3
test.txt's File Line Number: 1
test.txt's File Cumulative Line No.: 1
sample.txt's File Descriptor: 3
sample.txt's File Line Number: 1
sample.txt's File Cumulative Line No.: 2
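The difference between the per-file and the cumulative line number is easier to see when the files have more than one line each. A minimal sketch follows; the helper number_lines and the file names first.txt and second.txt are assumptions made for illustration.

```python
import fileinput
import tempfile
from pathlib import Path


def number_lines(paths):
    """Return (file name, per-file line number, cumulative line number) per line."""
    rows = []
    with fileinput.input(files=paths) as stream:
        for _ in stream:
            rows.append((Path(fileinput.filename()).name,
                         fileinput.filelineno(),
                         fileinput.lineno()))
    return rows


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "first.txt"
    second = Path(tmp) / "second.txt"
    first.write_text("a\nb\n")
    second.write_text("c\nd\ne\n")
    rows = number_lines((str(first), str(second)))

for row in rows:
    print(row)
```

filelineno() restarts at 1 for the second file, while lineno() keeps counting across both files.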

Checking Reading Status

with fileinput.input(files=('test.txt', 'sample.txt')) as files:
    for data in files:
        print(f"Read First Line: {fileinput.isfirstline()}")
        print(f"Last Line Read From sys.stdin: {fileinput.isstdin()}")

Output

Read First Line: True
Last Line Read From sys.stdin: False
Read First Line: True
Last Line Read From sys.stdin: False

The fileinput.isfirstline() function returns True if the line just read is the first line of its file; otherwise it returns False. Since both files contain a single line, it returned True each time.

The fileinput.isstdin() function returns True if the last line was read from sys.stdin; otherwise, it returns False.
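With multi-line files, isfirstline() is True only for the first line of each file. The sketch below records both flags per line; the helper first_line_flags and the sample contents are hypothetical.

```python
import fileinput
import tempfile
from pathlib import Path


def first_line_flags(paths):
    """Return (line text, isfirstline(), isstdin()) for every line read."""
    flags = []
    with fileinput.input(files=paths) as stream:
        for line in stream:
            flags.append((line.rstrip("\n"),
                          fileinput.isfirstline(),
                          fileinput.isstdin()))
    return flags


with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "first.txt"
    second = Path(tmp) / "second.txt"
    first.write_text("a\nb\n")
    second.write_text("c\n")
    flags = first_line_flags((str(first), str(second)))

for flag in flags:
    print(flag)
```

Because the input comes from regular files rather than sys.stdin, isstdin() stays False throughout.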

Closing the File

When the fileinput.input() function is used as a context manager with the with statement, the files are closed automatically, but the fileinput.close() function can also be called to release the resources when the work is done.

import fileinput
with fileinput.input(files=('test.txt', 'sample.txt')) as file:
    for data in file:
        # Stop reading as soon as a line is longer than 25 characters
        if len(data) > 25:
            fileinput.close()
            print('File has more than 25 characters.')
        else:
            print(data)

The above code demonstrates the use of the fileinput.close() function: if a line contains more than 25 characters, the input is closed and a message is printed; otherwise, the line is printed.

File has more than 25 characters.

Because the first line contained more than 25 characters, the input was closed and the message was printed.

The FileInput Class

The fileinput.FileInput class is an object-oriented alternative to the fileinput.input() function. The parameters are identical to those of the input() function.

Syntax

fileinput.FileInput(files=None, inplace=False, backup='', mode='r', openhook=None, encoding=None, errors=None)

Example

import fileinput
class OpenMultipleFiles:
    def __init__(self, *args):
        self.args = args
    def custom_open(self, filename, mode):
        # Append a note to the file; the with block closes it so the write is flushed
        with open(filename, "a+") as data:
            data.write(" Added data to the file.")
        return open(filename, mode)
    def read(self):
        with fileinput.FileInput(files=(self.args), openhook=OpenMultipleFiles().custom_open) as file:
            for data in file:
                print(data)
obj = OpenMultipleFiles('test.txt', 'sample.txt')
obj.read()

The class OpenMultipleFiles is defined in the above code. The class has an __init__ method that takes variadic arguments.

A custom_open method is defined within the class that opens the file in append+read mode, writes some data to the file, and returns the file object.

The read method creates an instance of fileinput.FileInput, passing self.args as the files argument and OpenMultipleFiles().custom_open as the openhook parameter. The contents of the files are then iterated and printed.

Finally, the OpenMultipleFiles class instance is created and passed the file names (test.txt and sample.txt) and stored within the obj variable. The read method is then invoked on the obj to read the specified files.

Hi, i am a test file. Data added through function. Added data to the file.
Hi, i am a sample file for testing. Data added through function. Added data to the file.

Comparison

Let's see how long it takes to process the contents of multiple files at the same time using the open() and the fileinput.input() function.

import timeit
# open() Function Code
code = '''with open('test.txt') as f1, open('sample.txt') as f2:
    f1.read()
    f2.read()'''
print(f"Open Function Benchmark: {timeit.timeit(stmt=code, number=1000)}")
# fileinput Code
setup = 'import fileinput'
code = '''with fileinput.input(files=('test.txt', 'sample.txt')) as file:
    for data in file:
        data'''
print(f"Fileinput Benchmark: {timeit.timeit(setup=setup, stmt=code, number=1000)}")

Using the timeit module, the above code measures the time it takes to process the contents of multiple files 1000 times for the fileinput.input() function and open() function. This method will aid in determining which is more efficient.

Open Function Benchmark: 0.3948998999840114
Fileinput Benchmark: 0.4962893000047188

Limitations

Every module is powerful in its own right, but it also has limitations, and the fileinput module is no exception.

It cannot load a whole file at once; it only iterates through the contents line by line.
It cannot write or append data to files, apart from the in-place rewriting that inplace=True provides.
It cannot perform advanced file-handling operations.
It can be less performant than open(), which may be noticeable when processing large files.

Conclusion

The fileinput module provides functions to process one or more than one file line by line to read the content. The fileinput.input() function is the primary interface of the fileinput module, and it provides parameters to give you more control over how the files are processed.

Let's recall what you've learned:

An overview of the fileinput module
Basic usage of the fileinput.input() with and without context manager
The fileinput.input() function and its parameters with examples
A glimpse of FileInput class
Comparison of fileinput.input() function with open() function for processing multiple files simultaneously
Some limitations of the fileinput module

🏆Other articles you might be interested in if you liked this one

How to use assert statements for debugging in Python?

Difference between the __init__ and __new__ methods.

What is context manager and the 'with' statement in Python?

How to implement getitem, setitem, and delitem in Python classes?

How to perform unit testing using the unittest module in Python?

File handling in Python - Opening, Reading, and much more.

Public, Protected, and Private access modifiers in Python.

That's all for now

Keep coding

Join, Merge, and Combine Multiple Datasets Using pandas

Sachin Pal — Wed, 05 Jul 2023 15:52:05 GMT

Data processing becomes critical when training a robust machine learning model. We occasionally need to restructure and add new data to the datasets to increase the efficiency of the data.

We'll look at how to combine multiple datasets and merge multiple datasets with the same and different column names in this article. We'll use the pandas library's following functions to carry out these operations.

pandas.concat()
pandas.merge()
pandas.DataFrame.join()

Preparing Sample Data

We'll create sample datasets using pandas.DataFrame() function and then perform concatenating operations on them.

The code in the image will generate two datasets from data and data1 using pd.DataFrame(data) and pd.DataFrame(data1) and store them in the variables df1 and df2.

Then, using the .to_csv() function, df1 and df2 will be saved in the CSV format as 'employee.csv' and 'employee1.csv' respectively.

Here, the data that we created looks as shown in the following image.

Combining Data Using concat()

We can use the pandas library to analyze, modify, and do other things with our CSV (comma-separated value) data. The library includes the concat() function which we will use to perform the concatenation of multiple datasets.

There are two axes on which the datasets can be concatenated: the row axis and the column axis.

Combine Data Along the Row Axis

We previously created two datasets named 'employee.csv' and 'employee1.csv'. We'll concatenate them along the row axis, which means the rows of the second dataset are appended below the rows of the first.

combine = pd.concat([dt, dt1])

The above code demonstrates the basic use of the concat() function. We passed a list of datasets(objects) that will be combined along the row axis by default.

The concat() function accepts some parameters that affect the concatenation of the data.

The indices of the data are taken from their corresponding data, as seen in the above output. How do we create a new data index?

The `ignore_index` Parameter

When ignore_index=True is set, a new index from 0 to n-1 is created. The default value is False, which is why the indices were repeated in the above example.

set_index = pd.concat([dt, dt1], ignore_index=True)

As shown in the image above, the dataset contains a new index ranging from 0 to 7.

The `join` Parameter

In the above image, we can see that the first four data points for the Salary and No_of_awards columns are missing.

This is due to the join parameter, which is set to "outer" by default and keeps every column from both datasets, filling the gaps with NaN. If it is set to "inner", only the columns common to both datasets are kept.

inner_join = pd.concat([dt, dt1], join="inner")

The `keys` Parameter

The keys parameter creates an index from the keys which is used to differentiate and identify the original data in the concatenated objects.

keys = pd.concat([dt, dt1], keys=["Dataset1", "Dataset2"])

The datasets were concatenated, and a multi-level index was created, with the first level representing the outermost index (Dataset1 and Dataset2 from the keys) and the second level representing the original index.
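The parameters discussed so far can be tried on two small frames. The data below is a hypothetical stand-in for the article's employee datasets (only the Salary column name is taken from the text); the point is the effect of ignore_index, join, and keys.

```python
import pandas as pd

# Hypothetical stand-ins for the two employee datasets
dt = pd.DataFrame({"Name": ["Asha", "Ben"], "Role": ["Dev", "QA"]})
dt1 = pd.DataFrame({"Name": ["Chen", "Dia"], "Role": ["Ops", "Dev"],
                    "Salary": [50000, 60000]})

stacked = pd.concat([dt, dt1])                        # original indices are kept
reindexed = pd.concat([dt, dt1], ignore_index=True)   # fresh 0..n-1 index
inner = pd.concat([dt, dt1], join="inner")            # only the shared columns
keyed = pd.concat([dt, dt1], keys=["Dataset1", "Dataset2"])  # two-level index

print(list(stacked.index))
print(list(reindexed.index))
print(list(inner.columns))
print(keyed.loc["Dataset2"])
```

With the default outer join, the rows coming from dt have NaN in the Salary column because dt has no such column.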

Combine Data Along the Column Axis

The datasets were concatenated along the row axis in the previous section; in this approach, we will place them side by side along the column axis using the axis parameter.

The axis parameter is set to 0 or "index" by default, which concatenates the datasets along the row axis, but if we change its value to 1 or "columns", it concatenates the datasets along the column axis.

combine_columns = pd.concat([dt, dt1], axis="columns")
#---------------------------OR---------------------------#
combine_columns = pd.concat([dt, dt1], axis=1)

Merging Data Using merge()

The pandas.merge() function merges two datasets based on common columns or indices.

We'll operate on a different dataset that we created and contains the information shown in the following image.

The merge() function takes left and right parameters which are datasets to be merged.

The `how` Parameter

We can now specify the type of merge we want to perform on these datasets by providing the how parameter. The how parameter allows for five different types of merges:

inner: Default. It only includes the values that match from both datasets.
outer: It includes all of the values from both datasets but fills the missing values with NaN (Not a Number).
left: It includes all of the values from the left dataset and replaces any missing values in the right dataset with NaN.
right: It includes all of the values from the right dataset and replaces any missing values in the left dataset with NaN.
cross: It creates the Cartesian product which means that the number of rows created will be equal to the product of the row counts of both datasets. If both datasets have four rows, then four times four (4 * 4) equals sixteen (16) rows.

Examples

Performing inner merge

inner_merging = pd.merge(dt1, dt2, how="inner")

We can see that only values with the same Id from both datasets have been included.

Performing outer merge

outer_merging = pd.merge(dt1, dt2, how="outer")

In the case of an outer merge, all of the values from both datasets were included, and the missing fields were filled in with NaN.

Performing left merge

left_merging = pd.merge(dt1, dt2, how="left")

The matching values of the right dataset (dt2) were merged in the left dataset (dt1) and the values of the last four columns (Project_id_final, Age, Salary, and No_of_awards) were not found for A4, so they were filled in with NaN.

Performing right merge

right_merging = pd.merge(dt1, dt2, how="right")

The matching values of the left dataset (dt1) were merged in the right dataset (dt2) and the values of the first five columns (Project_id_initial, Name, Role, Experience, and Qualification) were not found for A6, so they were filled in with NaN.

Cross Merging the Datasets

The how parameter has five different types of merge, one of which is a cross merge.

As previously stated, it generates the Cartesian product, with the number of rows formed equal to the product of row counts from both datasets. Take a look at the illustration below to get a better understanding.

cross_merging = pd.merge(dt1, dt2, how="cross")

Both datasets have four rows each, and each row from dt1 is repeated four times (row count of dt2), resulting in a data set of sixteen rows.
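The row counts produced by each merge type can be compared on a reduced version of the data. The frames below are hypothetical: the Id values follow the description above (A4 exists only on the left, A6 only on the right), while the names and salaries are invented.

```python
import pandas as pd

# Hypothetical, reduced versions of dt1 and dt2
dt1 = pd.DataFrame({"Id": ["A1", "A2", "A3", "A4"],
                    "Name": ["Asha", "Ben", "Chen", "Dia"]})
dt2 = pd.DataFrame({"Id": ["A1", "A2", "A3", "A6"],
                    "Salary": [50000, 60000, 55000, 70000]})

# Number of rows produced by each value of the how parameter
row_counts = {
    how: len(pd.merge(dt1, dt2, how=how))
    for how in ("inner", "outer", "left", "right", "cross")
}
print(row_counts)
```

The inner merge keeps the three shared Ids, the outer merge keeps all five distinct Ids, and the cross merge yields 4 * 4 = 16 rows.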

The `on`, `left_on` & `right_on` Parameters

The on parameter accepts the name of a column or index(row) to join on. It could be a single name or a list of names.

The left_on and right_on parameters take a column or index(row) name from the left and right datasets, respectively, to join on. They are used when the two datasets have different column names to join on.

Merging Datasets on the Same Column

To merge the datasets based on the same column, we can use the on parameter and pass the common column name that both datasets must have.

merging_on_same_column = pd.merge(dt1, dt2, on='Id')

We are merging datasets dt1 and dt2 based on the 'Id' column that they both share.

The matching Id column values from both datasets were merged, and the non-matching values were removed.

Merging Datasets on Different Columns

To merge on columns that have different names in the left and right datasets, use the left_on and right_on parameters.

left_right_merging = pd.merge(dt1, dt2, left_on="Project_id_initial", right_on='Project_id_final')

The joining column is the "Project_id_initial" column from the left dataset (dt1) and the "Project_id_final" column from the right dataset (dt2). The values shared by both columns will be used to merge them.

As we can see, the dataset includes both columns, as well as matching rows based on the common values in both the "Project_id_initial" and "Project_id_final" columns.

Changing the Suffix of the Column

Notice that the merged dataset has two Id columns labeled Id_x and Id_y. This is due to the suffixes parameter, which defaults to _x and _y: when overlapping column names are found in the left and right datasets, these suffixes are appended to them.

chg_suffix = pd.merge(dt1, dt2, suffixes=["_1", "_2"], left_on="Project_id_initial", right_on='Project_id_final')

This will append the suffixes "_1" and "_2" to the overlapping columns. Because both datasets have a column named Id, the left dataset's Id column appears as Id_1 and the right dataset's as Id_2.
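The effect of left_on, right_on, and suffixes on the resulting column names can be seen on two small frames. The column names come from the article; the values are hypothetical.

```python
import pandas as pd

# Hypothetical, reduced versions of dt1 and dt2
dt1 = pd.DataFrame({"Id": ["A1", "A2", "A3"], "Project_id_initial": [1, 2, 3]})
dt2 = pd.DataFrame({"Id": ["A1", "A2", "A6"], "Project_id_final": [1, 2, 5]})

default_suffix = pd.merge(dt1, dt2,
                          left_on="Project_id_initial",
                          right_on="Project_id_final")
custom_suffix = pd.merge(dt1, dt2, suffixes=("_1", "_2"),
                         left_on="Project_id_initial",
                         right_on="Project_id_final")

print(list(default_suffix.columns))
print(list(custom_suffix.columns))
```

Only the overlapping Id column receives a suffix; the two project columns keep their own names.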

Joining Datasets Using join()

The join() method works on the DataFrame object and joins the columns based on the index values. Let's perform a basic join operation on the dataset.

dt1.join(dt2, lsuffix="_1", rsuffix="_2")

The columns of the dt2 dataset will be joined with the dt1 dataset based on the index values.

Because both datasets share the same index values (0, 1, 2, and 3), we got all the rows.

The join() method's parameters can be used to manipulate the dataset. The join() method, like the merge() function, includes how and on parameters.

how: Default value is left join. It is the same as the how parameter of the merge() function, but the difference is that it performs index-based joins.
on: The column or index level name in the calling dataset to join on the index of the other dataset.
lsuffix and rsuffix: Used to append the suffix to the left and right datasets' overlapping columns.

Examples

Left join on an index

dt1.join(dt2.set_index("Id"), on="Id", how="left")
#-------------------------OR----------------------#
dt1.join(dt2.set_index("Id"), on="Id")

In the above code, we use set_index('Id') to set the Id column of the dt2 dataset as the index and perform a left join (how="left") on the Id column (on="Id") between dt1 and dt2.

This will join matching values in the Id column of the dt2 dataset with the Id column of the dt1 dataset. If any values are missing, they will be filled in by NaN.

It's the same as when we used the merge() function, but this time we're joining based on the index.

Right join on an index

dt1.join(dt2.set_index("Id"), on="Id", how="right")

We are joining the dt1 dataset with the index of the dt2 dataset based on the Id column. We got NaN in the first five columns for A6 because there were no values specified in the dt1 dataset.

Inner join on an index

dt1.join(dt2.set_index("Id"), on="Id", how="inner")

The datasets were joined based on matching index values, i.e., both datasets dt1 and dt2 share A1, A2, and A3, so the values corresponding to these indices were joined.

Outer join on an index

dt1.join(dt2.set_index("Id"), on="Id", how="outer")

We performed the outer join, which included all of the rows from both datasets based on the Id. The corresponding values have been filled in, and missing values have been filled in with NaN.

Cross Join

dt1.join(dt2, how="cross", lsuffix="_1", rsuffix="_2")

We didn't pass the on parameter, instead, we defined how the data should join (how="cross"). The resulting dataset will be the product of both datasets' row counts.
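The join variants above can be compared on reduced data. The frames are hypothetical stand-ins (Ids as described in the article, invented names and salaries); the sketch contrasts a positional index join with joins on the Id column.

```python
import pandas as pd

# Hypothetical, reduced versions of dt1 and dt2
dt1 = pd.DataFrame({"Id": ["A1", "A2", "A3", "A4"],
                    "Name": ["Asha", "Ben", "Chen", "Dia"]})
dt2 = pd.DataFrame({"Id": ["A1", "A2", "A3", "A6"],
                    "Salary": [50000, 60000, 55000, 70000]})

# Join on the default 0..3 index; overlapping columns need suffixes
by_position = dt1.join(dt2, lsuffix="_1", rsuffix="_2")

# Join dt1's Id column against dt2's Id index
left_join = dt1.join(dt2.set_index("Id"), on="Id", how="left")
inner_join = dt1.join(dt2.set_index("Id"), on="Id", how="inner")

print(list(by_position.columns))
print(left_join)
print(inner_join)
```

In the left join, A4 has no match in dt2, so its Salary is NaN; the inner join drops that row.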

Conclusion

We've learned how to use pandas.concat(), pandas.merge(), and pandas.DataFrame.join() to combine, merge, and join DataFrames.

The concat() function in pandas is a go-to option for combining the DataFrames due to its simplicity. However, if we want more control over how the data is joined and on which column in the DataFrame, the merge() function is a good choice. If we want to join data based on the index, we should use the join() method.

🏆Other articles you might be interested in if you liked this one

How to use assert statements for debugging in Python?

How to write unit tests using the unittest module in Python?

What are the uses of asterisk(*) in Python?

What are the init and new methods in Python?

How to build a custom deep learning model using Python?

How to generate temporary files and directories using tempfile in Python?

How to run the Flask app from the terminal?

That's all for now

Keep coding

Understanding Unit Testing in Python with the unittest Module

Sachin Pal — Thu, 29 Jun 2023 17:09:35 GMT

Unit testing is a crucial part of software development, ensuring that functions and tasks within code work as intended. By dividing code into small units and testing them independently, developers can catch errors early and simplify the debugging process.

This article explores how to implement unit testing in Python using the unittest module.

Getting Started With unittest

The unittest module includes a number of methods and classes for creating and running test cases. Let's look at a simple example where we used the unittest module to create a test case.

# basic.py
import unittest
class TestSample(unittest.TestCase):
    def test_equal(self):
        self.assertEqual(round(3.155), 3.0)
    def test_search(self):
        self.assertIn("G", "Geek")

First, we imported the unittest module, which will enable us to use the classes that will be used to write and execute test cases.

The TestSample class is defined that inherits from the unittest.TestCase which will allow us to use the various assertion methods within our test cases.

We defined two test methods within our TestSample class: test_equal and test_search.

The test method test_equal() tests if round(3.155) is equal to 3.0 using the assertEqual() assertion method.

The test method test_search() tests if the character "G" is present in the string "Geek" using the assertIn() assertion method.

To run these tests, we need to execute the following command in the terminal.

python -m unittest basic.py

This command will launch unittest as a module that searches for and executes the tests in the basic.py file.

Note: By default, the unittest module only discovers and executes methods whose names start with test.

By the way, these dots represent a successful test.

We can use the unittest.main() function and put it in the following form at the end of the test script to load and run the tests from the module.

if __name__ == '__main__':
    unittest.main()

This will allow us to run our test file, basic.py in this case, as the main module.

More Detailed Result

We can use the -v flag in the terminal or pass an argument verbosity=2 inside the unittest.main() function to get a detailed output of the test.

Commonly Used Assertion Methods

Here is the list of the most commonly used assertion methods in unit testing.

Method	Checks that
`assertEqual(a, b)`	`a == b`
`assertNotEqual(a, b)`	`a != b`
`assertTrue(x)`	`bool(x) is True`
`assertFalse(x)`	`bool(x) is False`
`assertIs(a, b)`	`a is b`
`assertIsNot(a, b)`	`a is not b`
`assertIsNone(x)`	`x is None`
`assertIsNotNone(x)`	`x is not None`
`assertIn(a, b)`	`a in b`
`assertNotIn(a, b)`	`a not in b`
`assertIsInstance(a, b)`	`isinstance(a, b)`
`assertNotIsInstance(a, b)`	`not isinstance(a, b)`

Example

Assume we have some code and want to perform unit testing on it using the unittest module.

# triangle.py
class Triangle:
    def __init__(self, base, height):
        self.base = base
        self.height = height
    def area(self):
        return 0.5 * self.base * self.height
    def perimeter(self, side):
        return self.base + self.height + side

The code defines a class called Triangle which has an __init__ method that initializes the object with the instance variables self.base and self.height.

There are two more methods in the Triangle class: area() and perimeter().

The area() method returns the area of the triangle, which is half the product of the base and height (0.5 * self.base * self.height).

The perimeter() method accepts a parameter called side. Because a triangle's perimeter is the sum of its three sides, the base and height variables stand in for the other two sides.

Now we can create another Python file in which we'll write some tests and then execute them.

# test_sample.py
from triangle import Triangle
import unittest
class TestTriangle(unittest.TestCase):
    t = Triangle(9, 8)
    def test_area(self):
        self.assertEqual(self.t.area(), 36)
    def test_perimeter(self):
        self.assertEqual(self.t.perimeter(5), 22)
    def test_valid_base(self):
        self.assertGreater(self.t.base, 0)
    def test_valid_height(self):
        self.assertGreater(self.t.height, 0)
if __name__ == '__main__':
    unittest.main(verbosity=2)

The above code imports the Triangle class from the triangle module (triangle.py file) as well as imports the unittest module to write test cases.

The TestTriangle class inherits from the unittest.TestCase which has four test methods. The Triangle class was instantiated with a base of 9 and height of 8 and stored inside the variable t.

The test_area method tests whether self.t.area() is equal to the expected result 36 using the assertEqual() assertion.

The test_perimeter method tests whether self.t.perimeter(5) is equal to 22 using the assertEqual() assertion.

The test_valid_base and test_valid_height methods are defined to test if the base (self.t.base) and height (self.t.height) of the triangle are greater than 0 using the assertGreater() assertion.

The unittest.main(verbosity=2) method retrieves and executes the tests from the TestTriangle class. We'll get a detailed output because we used the verbosity=2 argument.

Test for Exception

If you've used assert statements before, you'll know that when one fails, it throws an AssertionError. Similarly, whenever a test method fails, an AssertionError is raised.

We can predetermine the conditions under which our code will generate an error, and then test those conditions to see if they generate errors. This is possible with the assertRaises() method.

The assertRaises() method can be used with context manager so we'll use it in the following form:

def test_method(self):
    with self.assertRaises(exception_name):
        function_name(argument)

Consider the following function gen_odd(), which prints the odd numbers it encounters up to the argument n while incrementing num by 3. It contains only a few checks: the argument n must be of type int and must not be negative.

# odd.py
def gen_odd(n):
    if type(n) != int:
        raise TypeError("Invalid argument type.")
    if n < 0:
        raise ValueError("Value must be greater than 0.")
    num = 0
    while num <= n:
        if num % 2 == 1:
            print(num)
        num += 3

Now we'll write test methods to simulate conditions that could cause the above code to fail.

from odd import gen_odd
import unittest
class OddTestCase(unittest.TestCase):
    def test_negative_val(self):
        with self.assertRaises(ValueError):
            gen_odd(-5)
    def test_float_val(self):
        with self.assertRaises(TypeError):
            gen_odd(99.9)
    def test_string_val(self):
        with self.assertRaises(TypeError):
            gen_odd('10')
if __name__ == '__main__':
    unittest.main(verbosity=2)

We wrote three test methods in the OddTestCase class to ensure that when invalid arguments are passed, the corresponding error is raised.

The test_negative_val() method asserts that ValueError is raised when gen_odd(-5) is called.

Similarly, the test_float_val() and test_string_val() methods assert that when gen_odd(99.9) and gen_odd('10') are called, respectively, TypeError is raised.
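Beyond checking the exception type, the context manager form also lets us look at the exception that was caught. The following is a minimal sketch, assuming the same checks as gen_odd() (repeated here so the example runs on its own); the class name OddMessageTestCase is made up for illustration.

```python
import unittest


def gen_odd(n):
    """Same checks as gen_odd() in odd.py, repeated so this sketch is self-contained."""
    if type(n) != int:
        raise TypeError("Invalid argument type.")
    if n < 0:
        raise ValueError("Value must be greater than 0.")
    num = 0
    while num <= n:
        if num % 2 == 1:
            print(num)
        num += 3


class OddMessageTestCase(unittest.TestCase):  # hypothetical class name
    def test_negative_message(self):
        # The context manager stores the caught exception in cm.exception.
        with self.assertRaises(ValueError) as cm:
            gen_odd(-5)
        self.assertEqual(str(cm.exception), "Value must be greater than 0.")

    def test_type_message(self):
        # assertRaisesRegex checks the exception type and searches its message.
        with self.assertRaisesRegex(TypeError, "Invalid argument"):
            gen_odd("10")


# Run the tests programmatically so the script does not call sys.exit().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(OddMessageTestCase)
result = unittest.TextTestRunner(verbosity=2).run(suite)
```

Checking the message as well as the type can help when one function raises the same exception type for several different reasons.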

All three tests in the above code pass, which means each call raised the expected exception. If no exception had been raised, the test would have failed, and if a different exception had been raised, the test would have been reported as an error. Let's put it to the test.

from odd import gen_odd
import unittest

class OddTestCase(unittest.TestCase):
    def test_valid_arg(self):
        with self.assertRaises(TypeError, msg="Valid argument"):
            gen_odd(10)

if __name__ == '__main__':
    unittest.main(verbosity=2)

The above condition within the test_valid_arg() method will not throw a TypeError because gen_odd() function is passed with a valid argument.

The above test method failed and raised an AssertionError with the message TypeError not raised : Valid argument.

Skipping Tests

The unittest module provides the skip() decorator and the skipTest() method to skip a test method or a whole test class on purpose. In both cases, we specify the reason why the test is being skipped.

Consider the previous example's code, which we modified by adding the skip() decorator.

from odd import gen_odd
import unittest

class OddTestCase(unittest.TestCase):
    @unittest.skip("Valid argument")
    def test_valid_arg(self):
        with self.assertRaises(TypeError):
            gen_odd(10)

if __name__ == '__main__':
    unittest.main(verbosity=2)

It's clear that the above condition will fail and throw an AssertionError, so we skipped the testing on purpose.

What if we wanted to skip the test if a particular condition was true? We can accomplish this by using the skipIf() decorator, which allows us to specify a condition and skip the test if it is true.

from odd import gen_odd
import sys
import unittest

class OddTestCase(unittest.TestCase):
    @unittest.skipIf(sys.getsizeof(gen_odd(10)) > 10, "Exceeded limit")
    def test_memory_use(self):
        self.assertTrue(sys.getsizeof(gen_odd(10)) > 10)
        print(f"Size: {sys.getsizeof(gen_odd(10))} bytes")

if __name__ == '__main__':
    unittest.main(verbosity=2)

The condition in the skipIf() decorator above checks whether the size of gen_odd(10) is greater than 10 bytes. If the condition is true, the test method test_memory_use() is skipped; otherwise, the test is executed.
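The skipTest() method mentioned earlier is called from inside a test, which is useful when the decision to skip can only be made at run time. Here is a minimal sketch; the class name SkipExamples and the version and platform conditions are made up for illustration.

```python
import sys
import unittest


class SkipExamples(unittest.TestCase):  # hypothetical class name
    def test_runtime_skip(self):
        # skipTest() raises unittest.SkipTest, so nothing after it runs.
        if sys.version_info < (99, 0):
            self.skipTest("Requires a Python version that does not exist yet")
        self.fail("This line is never reached")

    # skipUnless() is the inverse of skipIf(): skip unless the condition is true.
    @unittest.skipUnless(sys.platform.startswith("win"), "Windows only")
    def test_windows_only(self):
        self.assertTrue(sys.platform.startswith("win"))


# Run the tests programmatically so the script does not call sys.exit().
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SkipExamples)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(f"Skipped: {len(result.skipped)}")
```

Skipped tests are reported separately and do not count as failures.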

Expected Failure

If we have a test method or test class with conditions that are expected to be false, we can use the expectedFailure() decorator to mark them as expected failures instead of checking for errors.

from odd import gen_odd
import sys
import unittest

class OddTestCase(unittest.TestCase):
    @unittest.expectedFailure
    def test_memory_use(self):
        self.assertTrue(sys.getsizeof(gen_odd(10)) < 10, msg="Expected to be failed")
        print(f"Size: {sys.getsizeof(gen_odd(10))} bytes")

if __name__ == '__main__':
    unittest.main(verbosity=2)

We've modified the previous code and the condition we are checking inside the test_memory_use() method is expected to be false, which is why the method is decorated with the @unittest.expectedFailure decorator.

Conclusion

We can use the unittest module to write and run tests to ensure that the code is working properly. A test can result in one of three outcomes: OK, FAIL, or ERROR.

The unittest module provides several assertion methods that are used to validate the code.

Let's recall, what we've learned:

the basic usage of unittest module.
CLI commands to run the tests.
testing if the condition is raising an exception.
skipping the tests on purpose and when a certain condition is true.
marking a test as an expected failure.

That's all for now

Keep coding

Advanced Python Coroutines: Best Practices for Efficient Asynchronous Programming

Sachin Pal — Fri, 23 Jun 2023 15:30:39 GMT

You have probably used functions in a Python program to perform a certain task. These functions are known as subroutines. Subroutines follow a set pattern of execution: they are entered at one point (the function call) and exited at another point (the return statement).

Coroutines are similar to subroutines, but unlike subroutines, they can be entered, exited, and resumed at different points during execution. Coroutines are used for cooperative multitasking, which means control is passed from one task to another so that multiple tasks can make progress concurrently.

Introduction To Coroutines

Coroutines (generator-based coroutines) are a specialized version of generators and like them, they can be paused and resumed using the yield keyword at the time of execution.

Generators generate data, whereas coroutines can do both, generating and consuming data, with a slight difference in how the yield is used within coroutines. We can use yield as an expression (value = yield) within coroutines, which means that yield can both generate and consume values.

To justify the above point, consider the following example in which we created a function that exhibits the behavior of a coroutine.

def cor_func(char):
    print(f"Searching for character: {char}")
    while True:
        data = yield
        if char in data:
            print("True")
        else:
            print("False")

value = cor_func("e")

The above code defines a coroutine function called cor_func, which searches for a parameter char. The function cor_func uses while True to run an infinite loop, and within the loop, yield is encountered, which is a halt in the execution and allows us to send data in the meantime. The caller's data is saved inside the variable data.

If the character char is present in the data, the function prints the message True, otherwise, the message False is printed.

We created the instance of the coroutine (cor_func("e")) and stored it inside the variable value.

value.__next__()
value.send("hello")
value.send("GeekPython")
value.send("Geek")

The coroutine is started by calling value.__next__(). The coroutine function will execute until it reaches the yield, allowing us to send data.

We first sent the string "hello" using value.send("hello"), and the coroutine checked whether the character "e" is present in it. Since there is an "e" in "hello", the output is True. The same process is repeated for value.send("GeekPython") and value.send("Geek").

Searching for character: e
True
True
True
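The coroutine above only consumes data. To illustrate the earlier point that yield can both generate and consume values, here is a small sketch of a coroutine that receives numbers and hands back the average so far. The function name running_average is made up for this example.

```python
def running_average():
    """Generator-based coroutine: receives numbers, yields the average so far."""
    total = 0.0
    count = 0
    average = None
    while True:
        # yield hands `average` to the caller, then waits for the next sent value.
        value = yield average
        total += value
        count += 1
        average = total / count


avg = running_average()
next(avg)            # prime the coroutine: run it up to the first yield
print(avg.send(10))  # 10.0
print(avg.send(20))  # 15.0
print(avg.send(30))  # 20.0
```

Each send() call resumes the coroutine with a new value and returns whatever the next yield produces.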

Closing the Coroutine

The close() method, as the name implies, is used to close the coroutine, which means that no more values can be sent to the coroutine.

Consider the previous code, which we modified by adding the value.close() method.

def cor_func(char):
    print(f"Searching for character: {char}")
    while True:
        data = yield
        if char in data:
            print("True")
        else:
            print("False")

value = cor_func("e")
value.__next__()
value.send("hello")
value.send("GeekPython")
value.close()
value.send("Geek")

We called the close() method on the coroutine, which will close the coroutine and prevent it from receiving further values. However, when we tried to send the string "Geek" to the coroutine, we got the following result.

Searching for character: e
True
True
Traceback (most recent call last):
  ....
    value.send("Geek")
StopIteration

Since the coroutine had already been closed, the send() method that followed the close() method threw the StopIteration exception.
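Internally, close() raises a GeneratorExit exception at the point where the coroutine is paused, which gives the coroutine a chance to clean up. The sketch below is a variant of cor_func that records what happens in a list; the log parameter is an addition made here purely for illustration.

```python
def cor_func(char, log):
    """Variant of cor_func; `log` is a hypothetical list used to record events."""
    try:
        while True:
            data = yield
            log.append(char in data)
    finally:
        # close() raises GeneratorExit at the paused yield, so this block runs.
        log.append("closed")


log = []
value = cor_func("e", log)
next(value)            # start the coroutine
value.send("hello")    # records True
value.close()          # triggers the finally block

try:
    value.send("Geek")
except StopIteration:
    # Sending to a closed coroutine raises StopIteration, as shown above.
    log.append("StopIteration")

print(log)  # [True, 'closed', 'StopIteration']
```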

Async Coroutine

We can define a coroutine function (async def) and pause the process until a specific task is completed by using the (async/await) keywords.

https://geekpython.in/asyncio-how-to-use-asyncawait-in-python

import asyncio

async def coroutine_func():
    print("Coroutine started.")
    await asyncio.sleep(1)
    print("Coroutine ended.")

cor = coroutine_func()
print(cor)
----------
<coroutine object coroutine_func at 0x0000023A863280C0>

The asyncio module was imported, which allows us to write asynchronous code. Then we defined the asynchronous coroutine function (coroutine_func()), which prints a message and then waits one second before printing another message using await asyncio.sleep(1).

We called the coroutine function and stored the resulting coroutine object in the variable cor. Printing cor shows a coroutine object, because calling a coroutine function does not run its body.

We can use the asyncio.run() function and pass in our coroutine object cor to run the above coroutine.

asyncio.run(cor)

Concurrency Using Coroutine

We can think of it as the ability to run multiple tasks concurrently in an overlapping manner. Let's understand with an example.

import asyncio

# Task 1
async def read_file():
    with open("test.txt") as data:
        await asyncio.sleep(1)
        print(data.read())

# Task 2
async def write_file():
    with open("test.txt", "a") as data:
        await asyncio.sleep(1)
        data.write("\nIt will be fun.")

# Task 3
async def message():
    print("Hey,")
    await asyncio.sleep(1)
    print("Welcome aboard.")

# Entry point
async def main():
    await asyncio.gather(message(), write_file(), read_file())

if __name__ == "__main__":
    import time
    start = time.perf_counter()
    asyncio.run(main())
    elapsed = time.perf_counter() - start
    print(f"Tasks executed in {elapsed:0.1f} seconds.")

Three coroutine functions are defined in the preceding code.

The coroutine function read_file() opens the file test.txt, waits for one second, reads the content, and then prints it.

The coroutine function write_file() opens the file in append mode, waits for one second, and then appends the data to the file.

The message() coroutine function prints one message, then waits for one second before printing another.

The coroutine function main() is defined to run those three coroutine functions concurrently using asyncio.gather(message(), write_file(), read_file()).

Inside the if __name__ == "__main__": block, we executed our main() coroutine function using asyncio.run(main()) and we used the time.perf_counter() to measure the execution time.

Hey,
Welcome aboard.
We are currently learning coroutines.
It will be fun.
Tasks executed in 1.0 seconds.

The code took about 1.0 second to execute all three coroutine functions because their 1-second delays overlapped. The coroutine functions ran concurrently rather than one after another.
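To make the time saving visible, we can compare awaiting coroutines one after another with running them through asyncio.gather(). This is a minimal sketch with made-up helper names (task, run_sequentially, run_concurrently, timed) and shorter delays of 0.2 seconds.

```python
import asyncio
import time


async def task(name, delay):
    # Stand-in for real work: wait, then return the task's name.
    await asyncio.sleep(delay)
    return name


async def run_sequentially():
    # Each await finishes before the next coroutine starts.
    return [await task("a", 0.2), await task("b", 0.2), await task("c", 0.2)]


async def run_concurrently():
    # gather() schedules all three coroutines so their waits overlap.
    return await asyncio.gather(task("a", 0.2), task("b", 0.2), task("c", 0.2))


def timed(coro_func):
    """Run a coroutine function and return its result with the elapsed time."""
    start = time.perf_counter()
    result = asyncio.run(coro_func())
    return result, time.perf_counter() - start


seq_result, seq_time = timed(run_sequentially)
con_result, con_time = timed(run_concurrently)
print(f"Sequential: {seq_time:0.1f} seconds")
print(f"Concurrent: {con_time:0.1f} seconds")
```

The sequential version takes roughly the sum of the delays, while the concurrent version takes roughly as long as the longest single delay.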

Awaiting Coroutine

Coroutines are awaitables (objects that can be used in an await expression), so they can be awaited from other coroutines. Let's look at the example to get a grasp of it.

import asyncio

async def read_file():
    with open("test.txt") as file:
        return file.read()

async def write_file():
    with open("test.txt", "a+") as file:
        file.write("\nLet's get started.")
    # Awaiting read_file() from write_file() coroutine
    print(await read_file())

asyncio.run(write_file())

When we run the above code with asyncio.run(write_file()), the coroutine function write_file() is called first, and it opens the test.txt file in append mode and appends the data. Then it proceeds until it encounters the await read_file(), which halts the execution of the write_file() coroutine function.

The execution flow then proceeds to the read_file() coroutine function, which reads the contents of the test.txt file.

When the execution flow returns, the content returned by read_file() is printed.

We are currently learning coroutines.
It will be fun.
Let's get started.

Inside the test.txt file, we can see that the string "Let's get started" has been appended to our existing data.

Conclusion

Coroutines are very helpful in asynchronous programming in which multiple tasks run concurrently. We've seen how multiple coroutines are executed concurrently to save time.

Coroutines can enter, exit, and resume at different points during the execution. They are similar to generators but they have additional features such as support for cooperative multitasking, asynchronous programming, and more.

That's all for now

Keep coding

Understanding assert For Debugging In Python

Sachin Pal — Mon, 19 Jun 2023 14:50:14 GMT

Python's assert statements are one of several options for debugging code in Python.

Python's assert is mainly used for debugging by allowing us to write sanity tests in our code. These tests are performed to ensure that a particular condition is True or False. If the condition is False, an AssertionError is raised, indicating that the test condition failed.

Understanding assert

Python's assert keyword is used to write assert statements, each of which contains a condition that we expect to be true at that point in the program.

If the condition is true, nothing is displayed on the console and execution continues. Otherwise, an AssertionError is raised. This exception interrupts program execution and indicates that the condition test failed.

Syntax

The syntax of the assert statement is written in the following form:

assert [condition], [error message]

condition - the condition or assumption to be tested

error message - the error message we want to display in the console when the condition fails.

The assert In Action

Let's create some assert statements to perform code checks. Consider the following example, in which we are testing our program to see if it produces the expected results.

def evaluate_num(num):
    if num > 5:
        return num * num
    else:
        return num * 2

val = evaluate_num(5)
"""Assert statement to check that upper code returns 10 on evaluating evaluate_num(5)"""
assert val == 10, "Condition failed." # We'll get nothing

The above code defines a function called evaluate_num that takes a parameter num. The function checks if the value of num is greater than 5. If it is, the function returns the square of num (num * num). Otherwise, if num is less than or equal to 5, the function returns num multiplied by 2 (num * 2).

The assert statement checks whether the variable val is equal to 10 after evaluating evaluate_num(5). In this case, evaluate_num(5) returns 10, which means that the assert statement is true and we'll get nothing in the console.

Let's see what happens when we pass a num greater than 5.

val = evaluate_num(6)
"""Assert statement to check that upper code returns 12 on evaluating evaluate_num(6)"""
assert val == 12, "Condition failed."

We called the evaluate_num with the argument 6. Since 6 is greater than 5, the function will square the number 6 (6 * 6) which makes the variable val equal to 36, which makes our assert statement false. As a result, we'll get an AssertionError with the message "Condition failed.".

Traceback (most recent call last):
  ....
    assert val == 12, "Condition failed."
AssertionError: Condition failed.
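The error message can be any expression, so it can include the value that caused the failure. The sketch below reuses evaluate_num and catches the AssertionError only to show the message; the wording of the message is made up for this example.

```python
def evaluate_num(num):
    if num > 5:
        return num * num
    return num * 2


try:
    val = evaluate_num(6)
    # An f-string message reports the actual value when the check fails.
    assert val == 12, f"Expected 12, got {val}"
except AssertionError as exc:
    message = str(exc)

print(message)  # Expected 12, got 36
```

Including the offending value in the message often saves a second debugging run.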

Controlling the Behavior of assert

We were able to write assertions in a single line by using the assert keyword. This single-line assert statement is equivalent to the following code:

if __debug__:
    if not val == 12:
        raise AssertionError("Condition failed.")

The above if __debug__ conditional would function similarly to the assert statement written in the preceding code.

In general, the if __debug__ equivalent of an assert statement has the following form:

if __debug__:
    if not condition:
        raise AssertionError(error message)

# ----------------- OR ----------------- #

# For simple form assert statement without error message
if __debug__:
    if not condition:
        raise AssertionError

__debug__

What exactly is __debug__, and how does it affect the behavior of assert statements in a Python program?

__debug__ is a built-in constant in Python that is True by default. It becomes False when Python runs in optimized mode, which we enable with the -O command-line option or by setting the PYTHONOPTIMIZE environment variable.

print(__debug__)
----------
True

As we can see when we printed the __debug__, we got True which indicates that our Python is not running in optimized mode.

Let's understand better with examples.

# test.py
class Shopping:
    def __init__(self, product, price):
        self.product = product
        self.price = price

    def list(self):
        assert self.price > 0, "Price should not be 0 or negative."
        data = f"{self.product} is worth ${self.price}."
        return data

item_1 = Shopping("Perfume", 250)
print(item_1.list())

item_2 = Shopping("Band Aid", 0.75)
print(item_2.list())

item_3 = Shopping("Denim", -35)
print(item_3.list())

The above code defines the Shopping class, which has a __init__ method that takes two parameters, product and price. These parameters' values are assigned to the instance variables self.product and self.price.

This class has another method called list that returns product information along with a price. This method includes an assert statement that determines whether the product's price is greater than 0.

Then we created three instances of the Shopping class (item_1, item_2, and item_3) and passed in the various products and prices. When we run the above code, we get the following result.

Perfume is worth $250.
Band Aid is worth $0.75.
Traceback (most recent call last):
  ....
    assert self.price > 0, "Price should not be 0 or negative."
AssertionError: Price should not be 0 or negative.

The first two instances passed the test because the assert statement condition (self.price > 0) was met. As a result, we received the string, whereas in the third case, the price was set to -35, which did not satisfy the assert statement condition, and we received the AssertionError with the error message.

The following if __debug__ conditional is equivalent to the assert statement in the list method of the Shopping class. If we had used it instead of the assert statement, the code would have behaved the same way.

if __debug__:
    if not self.price > 0:
        raise AssertionError("Price should not be 0 or negative.")

# Equivalent to
assert self.price > 0, "Price should not be 0 or negative."

Disabling Assertions

We can disable the assertion and prevent the AssertionError from being raised. We'll try it manually first, then look at safer options.

Since __debug__ is a constant, we cannot assign False to it inside a program. What we can do manually is make the check depend on __debug__ being False. Let's see how that behaves within our program.

# test.py
class Shopping:
    def __init__(self, product, price):
        self.product = product
        self.price = price

    def list(self):
        if __debug__ == False:
            if not self.price > 0:
                raise AssertionError("Price should not be 0 or negative.")
        data = f"{self.product} is worth ${self.price}."
        return data

item_1 = Shopping("Perfume", -1)
print(item_1.list())

item_2 = Shopping("Band Aid", 0.75)
print(item_2.list())

item_3 = Shopping("Denim", -35)
print(item_3.list())

Within our method list, the check now runs only when __debug__ is False, that is, only when Python is running in optimized mode.

In a normal run __debug__ is True, so the check is skipped and the above code produces no errors on the console.

Perfume is worth $-1.
Band Aid is worth $0.75.
Denim is worth $-35.

Note: This is not a good practice and is not a recommended way to disable assertions.

The -O Option

The -O flag is a command-line option that disables all assertions. Internally, this option sets the __debug__ constant to False.

D:\SACHIN\Pycharm\assert_in_python>python -O
>>> print(__debug__)
False

Open the terminal, change to the directory containing the Python file, and run the following command:

D:\SACHIN\Pycharm\assert_in_python>python -O test.py
Perfume is worth $-1.
Band Aid is worth $0.75.
Denim is worth $-35.

The python -O test.py command enables the optimized mode and executes the Python file test.py. The -O flag instructs the Python interpreter to optimize the code by turning off assertions.

We would have gotten the AssertionError if we hadn't used the -O flag.

D:\SACHIN\Pycharm\assert_in_python>python test.py
Traceback (most recent call last):
  ....
    assert self.price > 0, "Price should not be 0 or negative."
AssertionError: Price should not be 0 or negative.

PYTHONOPTIMIZE Env Variable

By setting the PYTHONOPTIMIZE environment variable to 1, we can run Python in optimized mode.

To set PYTHONOPTIMIZE=1, enter the following command in the terminal. This command will automatically run Python in optimized mode.

D:\SACHIN\Pycharm\assert_in_python>set PYTHONOPTIMIZE=1
D:\SACHIN\Pycharm\assert_in_python>python test.py
Perfume is worth $-1.
Band Aid is worth $0.75.
Denim is worth $-35.

When we check the status of the __debug__ constant in the Python shell, it is automatically set to False.

D:\SACHIN\Pycharm\assert_in_python>python
>>> print(__debug__)
False

To undo the optimized mode, use the command set PYTHONOPTIMIZE=0.
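Because assertions disappear in optimized mode, checks that must always run are usually written as explicit exceptions instead. Here is a minimal sketch, assuming the Shopping class from earlier and moving the price check into __init__ with a ValueError; that placement and exception type are choices made here for illustration, not part of the original code.

```python
class Shopping:
    def __init__(self, product, price):
        # An explicit check is not removed by -O or PYTHONOPTIMIZE.
        if price <= 0:
            raise ValueError("Price should not be 0 or negative.")
        self.product = product
        self.price = price

    def list(self):
        return f"{self.product} is worth ${self.price}."


print(Shopping("Perfume", 250).list())

try:
    Shopping("Denim", -35)
except ValueError as exc:
    error = str(exc)
    print(error)
```

With this version, the invalid price is rejected whether or not Python runs in optimized mode.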

Performing Debugging

In this section, we'll write a bunch of assert statements and then test them with pytest, a third-party package. This package contains a simpler syntax for writing tests.

Since this is an external package, we must install it by running the command pip install pytest in the terminal.

Make a Python file called test_file.py and place the following code, which includes tests, inside it.

# test_file.py
import math
from os.path import isdir

# test_1
def test_sq():
    assert 5 * 5 == 20

# test_2
def test_search():
    assert "Py" in "GeekPython"

# test_3
def test_dir():
    assert isdir("test_file.py")

# test_4
def test_type():
    assert type([1, 2, 3]) == list

class TestCondition:
    # test_5
    def test_reverse(self):
        sequence = "GeekPython"
        assert sequence[:: -1] == "nohtyPkeeG"

    # test_6
    def test_value(self):
        assert round(math.pi) == 3.14

Now, open a terminal, navigate to the directory containing the Python file test_file.py, and type pytest test_file.py.

D:\SACHIN\Pycharm\assert_in_python\test>pytest test_file.py
==================== test session starts ====================
platform win32 -- Python 3.10.5, pytest-7.3.2, pluggy-1.0.0
rootdir: D:\SACHIN\Pycharm\assert_in_python\test
plugins: anyio-3.6.2
collected 6 items

test_file.py F.F..F                                     [100%]

==================== FAILURES ====================
____________________ test_sq ____________________

    def test_sq():
>       assert 5 * 5 == 20
E       assert (5 * 5) == 20

test_file.py:6: AssertionError
____________________ test_dir ____________________

    def test_dir():
>       assert isdir("test_file.py")
E       AssertionError: assert False
E        +  where False = isdir('test_file.py')

test_file.py:12: AssertionError
____________________ TestCondition.test_value ____________________

self = <test_file.TestCondition object at 0x00000207D485FA60>

    def test_value(self):
>       assert round(math.pi) == 3.14
E       assert 3 == 3.14
E        +  where 3 = round(3.141592653589793)
E        +    where 3.141592653589793 = math.pi

test_file.py:23: AssertionError
==================== short test summary info ====================
FAILED test_file.py::test_sq - assert (5 * 5) == 20
FAILED test_file.py::test_dir - AssertionError: assert False
FAILED test_file.py::TestCondition::test_value - assert 3 == 3.14
==================== 3 failed, 3 passed in 0.32s ====================

The output of our tests produced by pytest is shown above, and we can see that three of them failed and three passed. The output provided full details for the three failed tests.

Note: pytest collects tests based on a naming convention. By default, classes containing tests must begin with Test, and any function in a file that should be treated as a test must begin with test_. pytest will run all files of the form test_*.py or *_test.py in the current directory and its subdirectories.
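For comparison, here is one way the three failing tests could be rewritten so that they pass. What the right fix is depends on what each test was meant to check, so this is only one interpretation; in particular, using the current working directory in test_dir is an assumption made for this sketch. The functions are plain Python, so they can be collected by pytest or called directly.

```python
import math
from os import getcwd
from os.path import isdir


def test_sq():
    # 5 * 5 is 25, not 20.
    assert 5 * 5 == 25


def test_dir():
    # Assumption: check a path that really is a directory,
    # here the current working directory.
    assert isdir(getcwd())


def test_value():
    # round() needs the number of digits to keep two decimal places.
    assert round(math.pi, 2) == 3.14


# Calling the functions directly raises AssertionError if a check fails.
test_sq()
test_dir()
test_value()
print("All three checks passed.")
```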

Conclusion

assert is a built-in keyword in Python that is used to write assert statements that perform sanity checks in our code. It is used for testing and debugging.

The assert statement includes a condition that is evaluated as True or False. If the condition is False, an AssertionError is raised, indicating that the condition was not met.

Let's recall what we've learned:

What is an assert statement with an example?
__debug__ constant in Python.
Controlling the behavior of assert statements.
Disabling assertions using the -O option and PYTHONOPTIMIZE environment variable.
Debugging code using the pytest package.

That's all for now

Keep Coding

Python Generators and the Yield Keyword - How They Work

Sachin Pal — Thu, 15 Jun 2023 15:43:07 GMT

Generators are defined by a function that generates values on the fly and can be iterated in the same way that we iterate strings, lists, and tuples in Python using the "for" loop.

When the body of a normal function contains the yield keyword instead of the return keyword, it is said to be a generator function.

In this article, we'll look at:

What are generator and generator functions?
Why do we need them?
What does the yield statement do?
Generator expression

Generator

PEP 255 introduced the generator concept, as well as the yield statement, which is used in the generator function.

When called, the generator function returns a generator object or generator-iterator object, which we can loop over in the same way that we do with lists.

# Generator function to generate odd numbers
def gen_odd(num):
    n = 0
    while n <= num:
        if n % 2 == 1:
            yield n
        n += 1

odd_num = gen_odd(10)
for i in odd_num:
    print(i)

The above code defines the gen_odd generator function, which accepts an arbitrary number and returns a generator object that can be iterated using either a "for" loop or the next() function.

By iterating over the generator-iterator object, we obtained the odd numbers between 0 and 10.

Why Generators?

We now have a general understanding of generators, but why do we use them? The great thing about generator functions is that they return iterators, and iterators use a strategy known as lazy evaluation, which means that they return values only when requested.

Consider a scenario in which we need to compute very large amounts of data. In that case, we can use generators to help us because generators compute the data on demand, eliminating the need to save data in memory.
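Lazy evaluation even makes it possible to describe a sequence that never ends, because values are only computed when they are requested. A minimal sketch, using a made-up generator function odd_numbers and itertools.islice from the standard library to take the first few values:

```python
from itertools import islice


def odd_numbers():
    """Yield odd numbers forever; nothing is computed until requested."""
    n = 1
    while True:
        yield n
        n += 2


# islice() requests only five values, so the infinite loop is never a problem.
first_five = list(islice(odd_numbers(), 5))
print(first_five)  # [1, 3, 5, 7, 9]
```

A list could not hold this sequence, but the generator only ever keeps its current state in memory.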

yield - What It Does?

The yield statement is what gives generators their allure, but what does it do within the function body? Let's look at an example to see how the process works.

def gen_seq(num):
    initial_val = 1
    while initial_val <= num:
        yield initial_val
        initial_val += 1

sequence = gen_seq(3)

The generator function gen_seq generates a sequence of numbers up to the specified num argument. The generator function was called, and it will return a generator object.

print(sequence)
----------
<generator object gen_seq at 0x000001FFBDB55770>

We can now call the generator object's __next__() method. To get the values, use sequence.__next__() or next(sequence).

print(sequence.__next__())
print(sequence.__next__())
print(sequence.__next__())
----------
1
2
3

When we call the generator object's next() method, the code inside the function executes until it reaches the yield statement.

What happens when the function code encounters the yield statement? The yield statement operates differently than the return statement.

The yield statement returns the value to the caller of next() and, instead of ending the function, it retains the function's state. When we call next() again, execution resumes where it left off.

Check the code below to gain a better understanding.

def gen_seq():
    print("Start")
    val = 0
    print("Level 1")
    while True:
        print("Level 2")
        yield val
        print("Level 3")
        val += 1

sequence = gen_seq()
print(sequence.__next__())
print(sequence.__next__())
print(sequence.__next__())

The print statement is set at every level in the above code to determine whether or not the yield statement continues execution from where it was left.

Start
Level 1
Level 2
0
Level 3
Level 2
1
Level 3
Level 2
2

The code begins at the beginning and progresses through levels 1 and 2 before returning the yielded value to the next() method's caller. When we call the next() method again, the previously yielded value increments by 1, and the execution cycle is resumed from where it was left.
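The retained state can also be observed directly with inspect.getgeneratorstate() from the standard library. The sketch below reuses the earlier gen_seq(num) function and records the state before the first call, while paused at yield, and after the generator has finished.

```python
import inspect


def gen_seq(num):
    initial_val = 1
    while initial_val <= num:
        yield initial_val
        initial_val += 1


sequence = gen_seq(1)
states = [inspect.getgeneratorstate(sequence)]      # not started yet

next(sequence)                                      # runs up to the first yield
states.append(inspect.getgeneratorstate(sequence))  # paused at yield

next(sequence, None)                                # runs to the end of the function
states.append(inspect.getgeneratorstate(sequence))  # finished

print(states)  # ['GEN_CREATED', 'GEN_SUSPENDED', 'GEN_CLOSED']
```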

Exception

The generators, like all iterators, can become exhausted after all the iterable values are evaluated. Consider the generator function gen_odd from earlier.

# Generator function to generate odd numbers
def gen_odd(num):
    n = 0
    while n <= num:
        if n % 2 == 1:
            yield n
        n += 1

odd_num = gen_odd(3)
print(odd_num.__next__())
print(odd_num.__next__())
print(odd_num.__next__())

The above code will generate odd numbers up to 3. As a result, the program will only generate 1 and 3, allowing us to call the next() method twice. When we run the above code, we will get the following result.

1
3
Traceback (most recent call last):
  ....
StopIteration

When we called the first two next() methods on odd_num, we got the yielded values, but when we called the last next() method, our code threw a StopIteration exception which indicates that the iterator has ended.

If we had used a "for" loop instead, the loop would have handled the StopIteration exception internally and simply ended.
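When calling next() by hand, another way to avoid the exception is to pass a default value as the second argument, which is returned once the generator is exhausted. A short sketch using the same gen_odd function:

```python
def gen_odd(num):
    n = 0
    while n <= num:
        if n % 2 == 1:
            yield n
        n += 1


odd_num = gen_odd(3)

# The second argument to next() is returned instead of raising StopIteration.
values = [next(odd_num, None) for _ in range(3)]
print(values)  # [1, 3, None]
```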

yield In try/finally

Take a look at the example below in which we have a generator function and try/except/finally clause inside it.

def func():
    try:
        yield 0
        try:
            yield 1
        except:
            raise ValueError
    except: # program never get to this part
        yield 2
        yield 3
    finally:
        yield 4

x = func()
for val in x:
    print(val)

If we run the above code, we'll get the following output:

0
1
4

We can see that we didn't get the values 2 and 3. No exception is raised while the generator runs, so neither except clause is entered. After the try block completes, the finally clause runs as usual and yields 4, and then the generator ends.

Generator Expression

You have probably used list comprehensions. A generator expression similarly allows us to create a generator in a single line of code. Unlike list comprehensions, generator expressions are enclosed within parentheses ().

gen_odd_exp = (n for n in range(5) if n % 2 == 1)
print(gen_odd_exp)
----------
<generator object <genexpr> at 0x000001D635E33C30>

The above generator expression gen_odd_exp is somewhat equivalent to the generator function gen_odd which we saw at the beginning. We can iterate just like we would with a generator function.

print(next(gen_odd_exp))
print(next(gen_odd_exp))
----------
1
3

When we compare the memory requirements of the generator expression and list comprehension, we get the following result.

import sys

gen_odd_exp = (n for n in range(10000) if n % 2 == 1)
print(f'{sys.getsizeof(gen_odd_exp)} bytes')

list_odd_exp = [n for n in range(10000) if n % 2 == 1]
print(f'{sys.getsizeof(list_odd_exp)} bytes')
----------
104 bytes
41880 bytes

The generator object in the case of generator expression took 104 bytes of memory, whereas the result in the case of list comprehension took 41880 bytes (almost 41 KB) of memory.
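The memory saving comes with a trade-off worth seeing in code: a generator can be consumed only once. The following sketch (odd_gen, total and leftover are illustrative names) passes a generator expression to sum() and then shows that nothing is left afterwards.

```python
# A generator expression producing the odd numbers below 10000 lazily.
odd_gen = (n for n in range(10000) if n % 2 == 1)

# sum() pulls values one at a time, so no intermediate list is built.
total = sum(odd_gen)

# The generator is now exhausted; iterating again yields nothing.
leftover = list(odd_gen)
print(total, leftover)
```

If the values are needed more than once, a list is the better fit despite its larger size.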

Conclusion

A normal function with the yield keyword in its body defines the generator. This generator function returns a generator-iterator object that can be iterated over to generate a data sequence.

It is said that generators are a Pythonic way to create iterators, and iterators use a strategy known as lazy evaluation, which means that they only return values when the caller requests them.

Generators come in handy when we need to compute large amounts of data without storing them in memory.

The quickest way to create a generator in a single line of code is to use a generator expression or generator comprehension (similar to list comprehension).

That's all for now

Keep Coding

High-level Path Operations Using pathlib Module In Python

Sachin Pal — Sat, 10 Jun 2023 14:20:36 GMT

The pathlib module is a part of Python's standard library and allows us to interact with filesystem paths and work with files using various methods and properties on the Path object.

Getting Started With pathlib

The most frequently used class of the pathlib module is Path. It is better to start with the Path class if we are using this module for the first time or are not sure which class to use for our task.

# Importing Path class from pathlib
from pathlib import Path

# Instantiating the Path
path = Path(__file__)
print(path)

In the above example, first, we imported the Path class from the pathlib module and then instantiated the Path with __file__.

This returns the absolute path to the current file, main.py, on which we are working.

D:\SACHIN\Pycharm\pathlib_module\main.py

The Path class instantiates the file's concrete path for the operating system on which the user is working. Because we're using Windows, we'll get the following output if we print the type of path.

print(type(path))
----------
<class 'pathlib.WindowsPath'>

Before we get into the methods and properties of Path, it's important to understand that the Path classes are divided into pure paths and concrete paths.

Pure Paths

Pure paths enable us to manipulate the file paths of another operating system, such as manipulating the Windows path on a Unix machine or vice versa without accessing the operating system.

Pure paths only support computational operations and do not support I/O operations such as reading, writing, or manipulating files.
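As a small sketch of what this means in practice, the pure classes below can be instantiated on any operating system because they never touch the filesystem. The paths themselves are made up for illustration.

```python
from pathlib import PurePosixPath, PureWindowsPath

# Both classes work on any OS because no system calls are made.
win = PureWindowsPath('C:/Users/demo/report.txt')
posix = PurePosixPath('/home/demo/report.txt')

win_parent = str(win.parent)      # parent computed with Windows rules
posix_parent = str(posix.parent)  # parent computed with POSIX rules
win_as_posix = win.as_posix()     # same path written with forward slashes

# Pure paths have no I/O methods such as read_text().
has_io = hasattr(win, 'read_text')
print(win_parent, posix_parent, win_as_posix, has_io)
```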

Pathlib's PurePath

PurePath is a class that is used to perform various operations on the path object. Consider the example below, in which we instantiate the PurePath() class.

# Importing PurePath class from pathlib
from pathlib import PurePath

path = PurePath('main.py')
print(path)
print(type(path))
----------
main.py
<class 'pathlib.PureWindowsPath'>

We got a PureWindowsPath object when we ran the above code because we are on a Windows machine. On a non-Windows machine, we would get a PurePosixPath object.

The PurePath() has two subclasses, which are as follows:

PureWindowsPath()
PurePosixPath()

PureWindowsPath

This subclass is implemented for Windows filesystem paths, as the name suggests.

# Importing PureWindowsPath class from pathlib
from pathlib import PureWindowsPath

# Instantiating PureWindowsPath
path = PureWindowsPath('main.py')
print(path)
print(type(path))
----------
main.py
<class 'pathlib.PureWindowsPath'>

PurePosixPath

This subclass is used for non-Windows filesystem paths.

# Importing PurePosixPath class from pathlib
from pathlib import PurePosixPath

# Instantiating PurePosixPath
path = PurePosixPath('main.py')
print(path)
print(type(path))
----------
main.py
<class 'pathlib.PurePosixPath'>

PurePath Methods And Properties

PurePath provides several methods that allow us to perform various operations on filesystem paths.

Getting the drive name

The PurePath.drive can be used to extract the drive name from the specified path. We'll get a string representing the drive name, or an empty string if no drives are present in the path.

from pathlib import PurePath

# Path having a drive name
drive = PurePath('D:/SACHIN/Pycharm/test.py').drive
print(drive)

# Path without a drive name
no_drive = PurePath('/SACHIN/Pycharm/test.py').drive
print(no_drive)
----------
D:

The first part of the code has a drive name in its path, which we got in the output, but the second part of the code did not, so we got an empty string.

Getting the root and stem

The root is the file path's top-level directory, which we can access with PurePath.root, and the stem is the last component of the file path without the suffix, which we can access with PurePath.stem.

from pathlib import PureWindowsPath, PurePosixPath

# Getting the root
root = PureWindowsPath('D:/SACHIN/Pycharm/').root
print(root)

# Getting the root from the Unix-like path
u_root = PurePosixPath('/SACHIN/Pycharm/').root
print(u_root)

# Getting the stem
stem = PureWindowsPath('D:/SACHIN/Pycharm/test.py').stem
print(stem)
----------
\
/
test

Getting the ancestors of the path

The PurePath.parents can be used to access the logical ancestors of the path.

from pathlib import PureWindowsPath

ancestor = PureWindowsPath('D:/SACHIN/Pycharm/test.py')
print(ancestor.parents[0])
print(ancestor.parents[1])
print(ancestor.parents[-1])
----------
D:\SACHIN\Pycharm
D:\SACHIN
D:\

Using indexing, we were able to access the path's ancestors. Python 3.10 added support for slices and negative index values for PurePath.parents.

We got the full path except for the file name when we used 0, one directory back when we used 1, and the beginning portion of the path when we used -1.
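PurePath.parents is a sequence, so it can also be iterated to walk up the path one level at a time. A brief sketch using the same example path (the variable names are illustrative):

```python
from pathlib import PureWindowsPath

p = PureWindowsPath('D:/SACHIN/Pycharm/test.py')

# Iterating parents goes from the closest ancestor up to the drive root.
ancestors = [str(a) for a in p.parents]
for a in ancestors:
    print(a)

# The first ancestor is the same object that PurePath.parent returns.
same_as_parent = p.parents[0] == p.parent
```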

Getting the parent

The PurePath.parent allows us to access the logical parent of the path.

from pathlib import PureWindowsPath

p = PureWindowsPath('D:/SACHIN/Pycharm/test.py')
print(p.parent)
----------
D:\SACHIN\Pycharm

In the above example, the parent directory of test.py is Pycharm/, the parent directory of Pycharm/ is SACHIN/, and the parent directory of SACHIN/ is the drive D:/, which contains all of these directories and files.

That's why we got this output D:\SACHIN\Pycharm.

Getting the name and suffix

PurePath.name provides access to the name of the path's final component, while PurePath.suffix provides access to the file extension of the final component. If the file has multiple extensions, we can get the list of file extensions with PurePath.suffixes.

from pathlib import PureWindowsPath

# Accessing the name of the final component of the path
file_name = PureWindowsPath('D:/SACHIN/Pycharm/test.py').name
print(file_name)

# Accessing the suffix of the final component of the path
file_suffix = PureWindowsPath('D:/SACHIN/Pycharm/test.py').suffix
print(file_suffix)
----------
test.py
.py

The last component of the path is test.py, and the extension is .py, which is what we got in the output.

What if our test.py file has extensions like test.py.zip? If we want to extract both extensions, we can use PurePath.suffixes.

# Accessing the multiple suffixes
file_suffixes = PureWindowsPath('test.py.zip').suffixes
print(file_suffixes)
----------
['.py', '.zip']
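Building on suffixes, here is a hedged sketch of a helper that removes every extension from a file name. strip_all_suffixes is a hypothetical function written for this example, not part of pathlib; it relies only on the suffix property and the with_suffix() method.

```python
from pathlib import PureWindowsPath

def strip_all_suffixes(path):
    """Return the path with every trailing suffix removed (hypothetical helper)."""
    # with_suffix('') drops one suffix at a time, so loop until none remain.
    while path.suffix:
        path = path.with_suffix('')
    return path

original = PureWindowsPath('D:/SACHIN/test.py.zip')
full_extension = ''.join(original.suffixes)  # all suffixes joined together
bare = strip_all_suffixes(original)
print(full_extension, bare)
```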

Check if a path is absolute

An absolute path is one that has both a root and a drive (if the naming convention allows). We can use the PurePath.is_absolute() method, which returns a boolean value, to determine whether or not a path is absolute.

from pathlib import PureWindowsPath, PurePosixPath

print(PureWindowsPath('D:/SACHIN/').is_absolute())
True

print(PureWindowsPath('/SACHIN/').is_absolute())
False

print(PurePosixPath('/SACHIN/').is_absolute())
True

print(PurePosixPath('D:/SACHIN/').is_absolute())
False

Looking at the first two PureWindowsPath cases, we first get True because the path has both a drive and a root, but then we get False because the path lacks a drive.

In the PurePosixPath cases, we first got True even though the path did not have a drive name because non-Windows paths do not include drive names like Windows paths. But when we used the drive name in the path, we got False.

Combining paths

PurePath.joinpath() allows us to concatenate the path with the argument passed to it.

from pathlib import PurePath, PureWindowsPath, PurePosixPath

print(PurePath('D:/SACHIN/').joinpath('test.txt'))
D:\SACHIN\test.txt

print(PurePath('D:/SACHIN/').joinpath(PureWindowsPath('test_dir', 'test.txt')))
D:\SACHIN\test_dir\test.txt

print(PurePosixPath('/SACHIN/').joinpath(PurePosixPath('test_dir', 'test.txt')))
/SACHIN/test_dir/test.txt
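Path objects also support the / operator, which joins components the same way joinpath() does. The sketch below compares the two and shows one behavior worth knowing: joining an absolute path discards everything before it. The variable names are illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

base = PureWindowsPath('D:/SACHIN')

# The / operator and joinpath() build the same path.
with_operator = base / 'test_dir' / 'test.txt'
with_method = base.joinpath('test_dir', 'test.txt')
print(with_operator)

# Joining an absolute path replaces the path on the left.
replaced = PurePosixPath('/SACHIN') / '/etc'
print(replaced)
```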

Matching the path

PurePath.match() takes a pattern and matches the path against the provided pattern(glob style pattern). When the path is matched, it returns True, otherwise, it returns False.

from pathlib import PureWindowsPath

p = PureWindowsPath('D:/SACHIN/test.txt').match('*.py')
print(p)

p1 = PureWindowsPath('D:/SACHIN/test.py').match('*.py')
print(p1)

p2 = PureWindowsPath('D:/SACHIN/test/test.py').match('test/*.py')
print(p2)
----------
False
True
True

Depending on the platform we're working on, pattern matching can be case-sensitive.

from pathlib import PureWindowsPath, PurePosixPath

print(PurePosixPath('/test/test.py').match('*.Py'))
print(PureWindowsPath('/test/test.py').match('*.Py'))
----------
False
True

Changing the name

PurePath.with_name() accepts a name argument and returns the new path with the changed file name.

from pathlib import PureWindowsPath

chg_name = PureWindowsPath('D:/SACHIN/test.txt').with_name('test.py')
print(chg_name)
----------
D:\SACHIN\test.py

If there is no name in the path, then we'll get a ValueError.

no_name = PureWindowsPath('D:/').with_name('test.py')
print(no_name)
----------
Traceback (most recent call last):
  ....
    raise ValueError("%r has an empty name" % (self,))
ValueError: PureWindowsPath('D:/') has an empty name

Changing the stem

The PurePath.with_stem() method creates a new path with a different stem.

from pathlib import PureWindowsPath

chg_stem = PureWindowsPath('D:/SACHIN/example.py').with_stem('test')
print(chg_stem)
----------
D:\SACHIN\test.py

The ValueError is thrown if the path does not have a name.

no_name = PureWindowsPath('D:/').with_stem('test')
print(no_name)
----------
Traceback (most recent call last):
  ....
    raise ValueError("%r has an empty name" % (self,))
ValueError: PureWindowsPath('D:/') has an empty name

Changing the suffix

We can change the suffix using PurePath.with_suffix(). If the file name lacks a suffix, the provided suffix will be appended.

from pathlib import PureWindowsPath

suf = PureWindowsPath('D:/SACHIN/test.py').with_suffix('.txt')
print(suf)

no_suf = PureWindowsPath('D:/SACHIN/test').with_suffix('.py')
print(no_suf)
----------
D:\SACHIN\test.txt
D:\SACHIN\test.py

What happens if we supply an empty string? The file's suffix will be removed.

empty_suf = PureWindowsPath('D:/SACHIN/test.py').with_suffix('')
print(empty_suf)
----------
D:\SACHIN\test

Concrete Paths

Concrete paths support I/O operations on filesystem paths in addition to computational operations. Unlike pure paths, concrete paths let us perform operations such as reading a file, writing data to a file, and creating, renaming, or removing files and directories.

We can make system calls on path objects thanks to concrete paths. Concrete paths are subclasses of pure path classes, and there are three ways to instantiate concrete paths:

Path()
WindowsPath()
PosixPath()

Pathlib's Path

At the beginning of the article, we saw a glimpse of the Path class, which is a subclass of the PurePath class that represents the concrete path of the filesystem path.

When we instantiate the Path() class, it generates either PosixPath or WindowsPath object, depending on the machine we're working on.

# Importing Path class from pathlib
from pathlib import Path

# Instantiating the Path
path = Path('D:/SACHIN/Pycharm')
print(path)
print(type(path))
----------
D:\SACHIN\Pycharm
<class 'pathlib.WindowsPath'>

The Path() created a concrete Windows path because we're on a Windows machine.

PosixPath

PosixPath is a subclass of PurePosixPath and Path class that represents concrete non-Windows filesystem paths.

Because PosixPath makes system calls, we can't instantiate it on our machine, which is running Windows.

# Importing PosixPath class from pathlib
from pathlib import PosixPath

# Instantiating PosixPath
path = PosixPath('main.py')
print(path)
----------
Traceback (most recent call last):
  ....
    raise NotImplementedError("cannot instantiate %r on your system"
NotImplementedError: cannot instantiate 'PosixPath' on your system

We can only instantiate the class that corresponds to our system. For example, we can instantiate the WindowsPath class on Windows machines and the PosixPath class on POSIX-compliant machines.

WindowsPath

WindowsPath is a subclass of PureWindowsPath and Path class that represents concrete Windows filesystem paths.

# Importing WindowsPath class from pathlib
from pathlib import WindowsPath

# Instantiating WindowsPath
path = WindowsPath('main.py')
print(path)
print(type(path))
----------
main.py
<class 'pathlib.WindowsPath'>

Path Methods

The Path class provides several methods for performing I/O operations on filesystem paths by interacting with the operating system.

Getting the current working directory and home directory

You may have used os.getcwd() to get the current working directory. Path.cwd() does the same thing, returning a new path object for the current working directory.

# Importing Path class from pathlib
from pathlib import Path

# Getting the current working directory
path = Path.cwd()
print(path)
----------
D:\SACHIN\Pycharm\pathlib_module

We obtained the path to our current working directory, and we can see that the path separator is a backslash(\) because we are using the Windows operating system.

Path.home() returns the path to the user's home directory. If the home directory cannot be resolved, a RuntimeError is thrown.

# Importing Path class from pathlib
from pathlib import Path

# Getting the home directory
path = Path.home()
print(path)
----------
C:\Users\SACHIN

Accessing the components of the path

We've seen the PurePath properties that help us access the path's components. Since Path is a subclass of PurePath, we can use those properties with the Path class as well.

# Importing Path class from pathlib
from pathlib import Path

# Instantiating the path
path = Path('D:/SACHIN/test.py')

# Accessing the drive name
print(path.drive)
D:

# Accessing the root
print(path.root)
\

# Accessing the name
print(path.name)
test.py

# Accessing the stem
print(path.stem)
test

# Accessing the suffix
print(path.suffix)
.py

# Accessing the parent
print(path.parent)
D:\SACHIN

Iterating the directories

Using Path.iterdir(), we can get the path objects of the contents of the specified directory.

from pathlib import Path

path = Path('D:/SACHIN/Pycharm/pathlib_module')

# Iterating the pathlib_module directory
for files in path.iterdir():
    print(files)
----------
D:\SACHIN\Pycharm\pathlib_module\.idea
D:\SACHIN\Pycharm\pathlib_module\files
D:\SACHIN\Pycharm\pathlib_module\main.py
D:\SACHIN\Pycharm\pathlib_module\test.py

The path in the above code points to the pathlib_module directory, and we obtained the path objects of the directories and files contained within pathlib_module.

Here is another example of the .iterdir() method.

path = Path('files')
for files in path.iterdir():
    print(files)
----------
files\example.md
files\file.py
files\test.txt

We iterated through the contents of the files directory, which is located in the current working directory.
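Since iterdir() yields both files and directories, it is often combined with is_file() or is_dir() to filter the results. The sketch below builds a throwaway directory with tempfile so it can run anywhere; the file and directory names are invented for the example.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    # Create one sub-directory and two files to iterate over.
    (base / 'docs').mkdir()
    (base / 'main.py').touch()
    (base / 'test.py').touch()

    # iterdir() returns entries in arbitrary order, so sort for stable output.
    files_only = sorted(p.name for p in base.iterdir() if p.is_file())
    dirs_only = sorted(p.name for p in base.iterdir() if p.is_dir())

print(files_only)
print(dirs_only)
```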

Filesystem Modification

Creating a directory

Path.mkdir() creates a new directory at the specified path with the default mode=0o777, which means the directory is accessible to all users and groups and has read, write, and execute permissions.

from pathlib import Path

# Creating a new dir at the specified path
path = Path('D:/SACHIN/Pycharm/pathlib_module/new_dir').mkdir(mode=0o777)

When we execute the above code, a new directory called new_dir is created in the pathlib_module directory.

If the path already exists, we will receive a FileExistsError. If we run the above code again, we'll get the following result.

FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'D:\\SACHIN\\Pycharm\\pathlib_module\\new_dir'

The path we specified already exists which is why the directory is not created. However, .mkdir() has an exist_ok parameter that, when set to True, ignores the error.

path = Path('D:/SACHIN/Pycharm/pathlib_module/new_dir').mkdir(exist_ok=True)
print('Directory created.')
----------
Directory created.

Note: Even with exist_ok=True, a FileExistsError is still raised if the path's final component is an existing non-directory file.
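mkdir() also accepts a parents parameter, which is useful when intermediate directories are missing. The following sketch, run inside a temporary directory so nothing permanent is created, shows the failure without parents=True and the success with it. The directory names a, b and c are arbitrary.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    nested = Path(tmp) / 'a' / 'b' / 'c'

    # Without parents=True, missing intermediate directories cause an error.
    try:
        nested.mkdir()
        failed = False
    except FileNotFoundError:
        failed = True

    # parents=True creates a/ and a/b/ as needed; exist_ok=True makes reruns safe.
    nested.mkdir(parents=True, exist_ok=True)
    nested.mkdir(parents=True, exist_ok=True)
    created = nested.is_dir()

print(failed, created)
```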

Creating a file

Path.touch() allows us to create a file with mode=0o666 at the specified path, indicating that the file has read and write permissions for all users and groups but no executable permission. The exist_ok parameter defaults to True.

from pathlib import Path

# Creating a new file at the specified path
path = Path('D:/SACHIN/Pycharm/pathlib_module/sample.txt').touch()

A file called sample.txt will be created. We'll get a FileExistsError if we set exist_ok=False and run the code again.

# Creating a new file at the specified path
path = Path('D:/SACHIN/Pycharm/pathlib_module/sample.txt').touch(exist_ok=False)
----------
FileExistsError: [Errno 17] File exists: 'D:\\SACHIN\\Pycharm\\pathlib_module\\sample.txt'

Renaming the files and directories

Methods like .with_name() and .with_stem() only return a new path object with a different name; they do not touch the filesystem. To actually rename a file or directory on disk, we can use Path.rename().

from pathlib import Path

path = Path('files')

# Renaming the directory
path.rename('docs')
print('Directory renamed successfully.')
----------
Directory renamed successfully.

The directory files will be renamed to docs. What happens if the target file or directory name already exists? On Windows, the code will raise a FileExistsError. On Unix, an existing target file may be replaced silently, so the behavior depends on the platform.

path = Path('docs')

# Renaming the directory
path.rename('files')
print('Directory renamed successfully.')
----------
FileExistsError: [WinError 183] Cannot create a file when that file already exists: 'docs' -> 'files'

The above code threw an error because a directory named files already exists in the project directory.
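One way to avoid relying on platform-specific behavior is to check the target before renaming. This is a minimal sketch under that assumption, using a temporary directory; source, target and renamed are illustrative names. Since Python 3.8, rename() returns the new Path object.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    base = Path(tmp)
    source = base / 'files'
    source.mkdir()
    target = base / 'docs'

    # Only rename when the target name is free.
    if target.exists():
        renamed = None
    else:
        renamed = source.rename(target)  # returns the new path (Python 3.8+)

    source_gone = not source.exists()
    target_is_dir = target.is_dir()

print(renamed.name, source_gone, target_is_dir)
```

Note that the check and the rename are two separate steps, so another process could still create the target in between.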

Removing the directory

Path.rmdir() deletes the directory specified in the path, but only if it is empty; otherwise, an OSError is raised.

from pathlib import Path

# Removing the directory at the specified path
path = Path('D:/SACHIN/Pycharm/pathlib_module/files').rmdir()

If we attempt to remove a directory that does not exist, we will receive a FileNotFoundError.

path = Path('D:/SACHIN/Pycharm/pathlib_module/files').rmdir()
----------
FileNotFoundError: [WinError 2] The system cannot find the file specified: 'D:\\SACHIN\\Pycharm\\pathlib_module\\files'

Reading and Writing Operations

Path class provides several methods to perform reading and writing operations on the file. Assume we have a text file with some data and we want to read and write that data.

Opening the file

The Path class provides an .open() method that opens the file specified by the path so we can read or write data. You may have already used the built-in open(); this method works in the same way.

from pathlib import Path

# Instantiating the path of the file
open_file = Path('sample_file.txt')

# Using the open() method
with open_file.open(mode='r') as file:
    # Reading the content
    print(file.read())
----------
Hi, I am sample file for testing.

Reading the file

To read the content of the file specified by the path, we can use the Path.read_text() method.

from pathlib import Path

path = Path('sample_file.txt').read_text(encoding='utf-8')
print(path)
----------
Hi, I am sample file for testing.

Writing data to the file

The Path class provides a .write_text() method for writing text data to a file.

from pathlib import Path

# Instantiating the path of the file
path = Path('sample_file.txt')

# Writing data to the file
path.write_text('Hello from GeekPython.')

# Reading the data
print(path.read_text())
----------
Hello from GeekPython.

Similarly, we can use the Path.write_bytes() method to write binary data to a file. It opens the file in binary mode.

# Instantiating the path of the file
path = Path('sample_file.txt')

# Writing binary data to the file
path.write_bytes(b'Hello from GeekPython.')

# Reading the binary data
print(path.read_bytes())
----------
b'Hello from GeekPython.'

We wrote the binary data to sample_file.txt and then read the file content back using the .read_bytes() method.

Path.read_bytes() opens the file in binary mode and returns the contents of the file as a byte string.
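The reading and writing methods above can be combined into one round trip. The sketch below works in a temporary directory so it leaves no files behind. One detail to note: write_text() overwrites the file, so appending is done with open() in 'a' mode. The variable names are illustrative.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / 'sample_file.txt'

    # write_text() creates or overwrites the file and returns the character count.
    written = path.write_text('Hello from GeekPython.', encoding='utf-8')
    text = path.read_text(encoding='utf-8')
    raw = path.read_bytes()

    # Appending requires open() because write_text() always overwrites.
    with path.open(mode='a', encoding='utf-8') as file:
        file.write('\nSecond line.')
    lines = path.read_text(encoding='utf-8').splitlines()

print(written, text, raw, lines)
```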

Conclusion

The pathlib module provides high-level classes for manipulating file paths. These classes can be used to perform various operations on file paths as well as interact with files to perform I/O operations.

Let's recall what we've learned:

Pure path and Concrete path classes
Path operations using the PurePath class
Path class for instantiating concrete paths
Methods of the Path class
Reading and writing files
Modifying the filesystem

Reference - docs.python.org/3/library/pathlib.html

That's all for now

Keep Coding

Understanding the Different Uses of the Asterisk(*) in Python

Sachin Pal — Mon, 05 Jun 2023 15:12:23 GMT

You must have seen the asterisk or star symbol inside the parameterized function or used it to perform mathematical or other high-level operations.

We'll look at the different ways the asterisk(*) is used in Python in this article.

Asterisks(*) are used in the following situations:

Exponentiation and multiplication
Unpacking/Packing
Repetition
Formatting Strings
Tuple and Set Unpacking

Let's begin by breaking down the use of asterisks in the above cases one by one.

Multiplication

Python has arithmetic operators, one of which is an asterisk(*), which is commonly used to perform multiplication operations.

# Multiplication using asterisk(*)
multiply = 9 * 7
print(multiply)
----------
63

In the above code, we performed a simple arithmetic operation by multiplying 9 and 7 using the asterisk(*) as the multiplication operator.

We can also do this.

def multiply(a, b):
    return a * b

product = multiply("Geek", 5)
print(product)
----------
GeekGeekGeekGeekGeek

The above code defines the function multiply, which takes two parameters a and b, and returns a * b.

We passed "Geek" and 5 as arguments and printed the result. As the output shows, multiplying a string by an integer repeats it, so "Geek" was repeated 5 times.

Exponentiation

Similarly, we can use the asterisk to perform an exponentiation operation, but to get the exponential value of an integer, we must use a double asterisk(**).

# Exponentiation using a double asterisk(**)
expo = 4 ** 4
print(expo)
----------
256

We obtained 256 as the result of evaluating 4 raised to power 4(4**4). The double asterisk(**) is also known as the power operator.

Packing and Unpacking

Before we begin, we must understand that there are two types of arguments: positional arguments and keyword arguments.

Positional arguments are passed to the function based on their position, whereas keyword arguments are passed to the function based on their parameter name.

Consider the following example of positional and keyword arguments.

# Positional and keyword arguments
def friends(f1, f2, f3=None, f4=None):
    return f1, f2, f3, f4

# Passing two positional and two keyword arguments
names = friends("Rishu", "Yashwant", f3="Abhishek", f4="Yogesh")
print(names)

# Passing a keyword argument before a positional argument
names1 = friends("Rishu", f4="Yogesh", "Yashwant")
print(names1)

The function friends in the above code takes two positional arguments, f1 and f2, and two keyword arguments, f3 and f4, which default to None.

In the first case, we used two positional arguments "Rishu" and "Yashwant" and two keyword arguments f3="Abhishek" and f4="Yogesh" to call the function.

You can see how "Rishu" and "Yashwant" were passed into the position of f1 and f2, respectively, and "Abhishek" and "Yogesh" were passed with the parameter names f3 and f4. However, we cannot change the position, as we did in the second case by passing the keyword argument before the positional argument; doing so will result in a SyntaxError.

If we try to pass an arbitrary number of arguments inside the function friends, we will get an error. As a result, we need a way to pass variadic arguments within the function.

We can use *args and **kwargs to pass variadic arguments in situations where we need to.

https://geekpython.in/understanding-args-and-kwargs-in-python-best-practices-and-guide

Argument Packing

Passing variadic positional arguments

# Use of *args
def friends(*args):
    return args

# Passing variadic positional arguments
names = friends("Rishu", "Yashwant", "Abhishek", "Yogesh")
print(names)
----------
('Rishu', 'Yashwant', 'Abhishek', 'Yogesh')

The function friends takes *args, indicating that the function accepts a variable number of positional arguments. We passed an arbitrary number of positional arguments, which were saved in the tuple called args.

Passing variadic keyword arguments

# Use of **kwargs
def friends(**kwargs):
    return kwargs

# Passing variadic keyword arguments
names = friends(f1="Rishu", f2="Yashwant", f3="Abhishek")
print(names)

This time function accepts **kwargs, indicating that it accepts an arbitrary number of keyword arguments. We passed an arbitrary number of keyword arguments, which were saved in the dictionary called kwargs.

As a result, we can conclude that the asterisks with args and kwargs were used to pack the arguments.
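Both forms of packing can appear in the same signature. The function below is a hypothetical example (describe, sep, city and role are made-up names) that takes one required argument, packs any extra positional arguments into args, and packs any extra keyword arguments into kwargs.

```python
def describe(first, *args, sep=", ", **kwargs):
    """Return the first name, the remaining names, and the keyword details."""
    extras = sep.join(args)  # args is a tuple of the extra positional arguments
    details = sep.join(f"{key}={value}" for key, value in kwargs.items())
    return first, extras, details

result = describe("Rishu", "Yashwant", "Abhishek", city="Delhi", role="dev")
print(result)
```

Parameters declared after *args, such as sep here, can only be passed by keyword.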

Unpacking

We have used single and double asterisks as packing operators with args and kwargs, respectively, and we can also use them for unpacking.

The iterables are unpacked with a single asterisk(*), while the dictionaries are unpacked with a double asterisk(**).

# List of names
names_lst = ["Sachin", "Rishu", "Yashwant", "Abhishek"]

def greet(*names):
    for name in names:
        print(f'Welcome to GeekPython, {name}.')

message = greet(*names_lst)

We have a list of names and a function greet that accepts a variable number of positional arguments and iterates through them to print some information.

You'll notice that we passed names_lst as *names_lst because this will unpack the items in the names_lst list.

Welcome to GeekPython, Sachin.
Welcome to GeekPython, Rishu.
Welcome to GeekPython, Yashwant.
Welcome to GeekPython, Abhishek.

If we had passed the names_lst to the function, we would have gotten the following result.

message = greet(names_lst)
----------
Welcome to GeekPython, ['Sachin', 'Rishu', 'Yashwant', 'Abhishek'].

Similarly, we can unpack the dictionaries by using a double asterisk(**). Let us illustrate with an example.

# Dictionary with name and roles
occ_dict = {
    "Sachin": "Python dev",
    "Rishu": "C++ dev",
    "Yashwant": "C++ dev",
    "Abhishek": "PHP dev"
}

def occupation(**roles):
    for key, value in roles.items():
        print(f"{key} is a {value}.")

occupation(**occ_dict)

The function occupation accepts **roles, which means we can pass an arbitrary number of keyword arguments when calling the function, and it then iterates through the key and value and prints them.

We called the function and passed the dictionary occ_dict prefixed with a double asterisk(**). This will unpack the dictionary's keys and values, producing the output shown below.

Sachin is a Python dev.
Rishu is a C++ dev.
Yashwant is a C++ dev.
Abhishek is a PHP dev.

Alternatively, we can pass the keyword arguments in the function call as shown below.

occupation(Sachin="Python dev", Rishu="C++ dev")
----------
Sachin is a Python dev.
Rishu is a C++ dev.

Repetition Of List And String

We can use an asterisk as a repetition operator, repeating the string or list for a specific number (similar to multiplying them). Let's understand with an example.

# Using asterisk as a repetition operator
data = "GeekPython" * 3
print(data)

print("-" * 20)

lst_data = ["Geek", "Python"] * 3
print(lst_data)

In the above code, the asterisk repeats the string and the list. The output shows each of them repeated three times.

GeekPythonGeekPythonGeekPython
--------------------
['Geek', 'Python', 'Geek', 'Python', 'Geek', 'Python']

Asterisk In String Formatting

You must have used different methods to format the strings in Python. We can use the asterisk in string formatting to unpack the tuples/lists values.

We'll use the .format string formatting method.

# Using asterisk in string formatting
# Unpacking list
data = "{} {} {}.".format(*["Welcome", "to", "GeekPython"])
print(data)

# Unpacking tuple
val = "{} {} {}.".format(*(1, 2, 3))
print(val)

In the above code, we are just unpacking the list and tuple values using the single asterisk(*) passed within the .format string formatting.

Welcome to GeekPython.
1 2 3.

The asterisk works here because .format() is an ordinary method call, so the usual argument unpacking applies. The goal is simply to demonstrate how we can use an asterisk while formatting strings.

Other formatting styles, such as f-strings and the % operator, do not take their values as call arguments, so this kind of unpacking does not apply to them in the same way.

Tuple/Set Unpacking

We learned earlier in this article how to use * asterisks to unpack the iterable and ** asterisks to unpack the dictionary. In this section, we'll use the asterisk to unpack the values contained within the tuple/set.

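x = {'One', 'piece', 'is', 'real'}
print(type(x))
print(*x)
----------
<class 'set'>
One real is piece

The set {'One', 'piece', 'is', 'real'} is unpacked into its individual elements in the preceding example. Because sets are unordered, the elements may be printed in any order, as the output One real is piece shows.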

Here's an example of how to unpack a tuple.

x, *y, z = ('One', 'piece', 'is', 'real')
print(x)
print(y)
print(z)
----------
One
['piece', 'is']
real

The output shows that the tuple unpacks into x=One, y=['piece', 'is'], and z=real.
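The same star syntax also works inside list and dictionary literals, which gives a compact way to merge collections. A short sketch with made-up values:

```python
# Capture the first item and pack everything else into a list.
first, *rest = ['One', 'piece', 'is', 'real']

# Unpack several iterables into a single list.
merged_list = [*[1, 2], *(3, 4)]

# Unpack dictionaries into one; later keys override earlier ones.
merged_dict = {**{'a': 1, 'b': 2}, **{'b': 20, 'c': 3}}

print(first, rest)
print(merged_list)
print(merged_dict)
```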

Others

Another application of the asterisk could be in importing all of the classes and functions from the modules.

from time import *

# Current time in seconds
curr_time = time()
print(f'Current time in sec: {curr_time}')

# Formatted date and time
t = strftime("%A, %d %B %Y %H:%M:%S")
print(t)
----------
Current time in sec: 1685963439.6365585
Monday, 05 June 2023 16:40:39

We were able to use the time and strftime functions from the time module because the asterisk imports all public names from the module.

Note: This is not a good practice. A wildcard import pulls every public name from the module into the current namespace, which can silently overwrite existing names and makes it hard to tell where a name came from.
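A safer alternative, sketched here, is to import only the names that are needed, which keeps the origin of each name visible. The variable names curr_time and stamp and the date format are chosen only for illustration.

```python
# Import just the two functions that are used.
from time import strftime, time

curr_time = time()             # seconds since the epoch, as a float
stamp = strftime("%Y-%m-%d")   # current local date, e.g. year-month-day

print(f'Current time in sec: {curr_time}')
print(stamp)
```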

Conclusion

This article explains the use of asterisks in Python. We can use it as a multiplication operator or to pack and unpack arguments.

Let's recall what we've learned:

Using asterisk for exponentiation and multiplication
Argument packing/unpacking
Using as a repetition operator
Using asterisk in the string formatting
Unpacking the tuple and set values

That's all for now

Keep Coding

__init__ Vs __new__ Methods In Python - With Examples

Sachin Pal — Thu, 11 May 2023 17:39:57 GMT

You have probably seen the __init__ method in many Python classes, and if you have worked with Python classes, you have likely implemented it many times. However, you are much less likely to have implemented or seen a __new__ method within a class.

In this article, we'll see:

Definition of the __init__ and __new__ methods
__init__ method and __new__ method implementation
When they should be used
The distinction between the two methods

__init__ Vs __new__ Method

The __init__ method is an initializer method that is used to initialize the attributes of an object after it is created, whereas the __new__ method is used to create the object.

When we define both the __new__ and the __init__ methods inside a class, Python first calls the __new__ method to create the object and then calls the __init__ method to initialize the object's attributes.

In many programming languages, a single constructor both creates and initializes an object, but Python splits this work between a constructor and an initializer.

Let's talk about both these methods one by one and implement these methods inside a Python class.

__new__ Method

As stated already, the __new__ method is a constructor method used to create and return an object (an instance of the class).

Syntax

object.__new__(cls, *args, **kwargs)

The __new__ method's first parameter is cls, which is the class of the object we want to create.

object.__new__ itself does not use *args and **kwargs. However, when a class is called, the same arguments are passed first to __new__ and then to __init__, so an overridden __new__ must accept whatever arguments the class's __init__ method accepts.
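To see how the arguments travel, consider the following sketch. The Point class and the seen_by_new attribute are hypothetical names used only for this illustration.

```python
class Point:
    def __new__(cls, *args, **kwargs):
        # object.__new__ is given only cls; the extra arguments are just recorded
        instance = super().__new__(cls)
        instance.seen_by_new = (args, kwargs)
        return instance

    def __init__(self, x, y=0):
        # The same arguments arrive here after __new__ returns
        self.x = x
        self.y = y


p = Point(1, y=2)
print(p.seen_by_new)  # ((1,), {'y': 2})
print(p.x, p.y)       # 1 2
```

Accepting *args and **kwargs in __new__ keeps it compatible with any __init__ signature.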

https://geekpython.in/understanding-args-and-kwargs-in-python-best-practices-and-guide

Example

# Defined a base class
class Name:
    # Created a __new__ method
    def __new__(cls):
        print(f'Called the __new__ method.')
        return super(Name, cls).__new__(cls)

    # Created an __init__ method
    def __init__(self):
        print(f"Called the __init__ method.")

# Created an object
Name()

In the above code, we defined the __new__ and __init__ methods within the class Name. The __new__ method accepts the cls parameter, which is used to refer to the class Name, and when called, it prints the message and returns the class instance using the super(Name, cls).__new__(cls).

One thing to note is that the Name class has no explicit base class, so it inherits directly from object, and we could have created the instance with the expression object.__new__(cls). However, the usual approach is to use the super() function.

https://geekpython.in/super-in-python

The __init__ method is then called with the instance passed to the self parameter.

Then we called the Name class (Name()), and when we run the code, we get the output shown below.

Called the __new__ method.
Called the __init__ method.

The output shows that the __new__ method is called first and then the __init__ method.

__init__ Method

As we saw in the above example, the __init__ method is called to initialize the attributes of the object as soon as the object is created.

https://geekpython.in/init-and-call-method#heading-init-method

Syntax

__init__(self, *args, **kwargs)

As a first parameter, the __init__ method accepts self, which is used to refer to the class instance.

The parameters *args and **kwargs are used to initialize the instance variable with the values stored within them.

Example

# Defined a base class
class Name:
    # Created a __new__ method
    def __new__(cls, name):
        print(f'Called the __new__ method.')
        return super(Name, cls).__new__(cls)

    # Created an __init__ method
    def __init__(self, name):
        print(f"Called the __init__ method.")
        self.name = name

# Created an object
name_obj = Name('Sachin')
print(name_obj.name)

In the __init__ method, we passed the name parameter and did the same in the __new__ method to make the __new__ and __init__ method signature compatible with each other.

We called the class with the 'Sachin' argument, which will automatically invoke the __init__ method and will initialize the instance variable self.name with this value.

When we access the name attribute (instance variable) on the object name_obj, we'll get the following output.

Called the __new__ method.
Called the __init__ method.
Sachin

The name attribute(instance variable) of the name_obj is initialized to the value 'Sachin'.

Implementation

Let's define both __new__ and __init__ methods inside the class Language.

class Language:
    def __new__(cls, *args):
        return super().__new__(cls)

    def __init__(self, lang, year):
        self.lang = lang
        self.year = year

language = Language('Python', 1991)
print(language.lang)
print(language.year)
----------
Python
1991

We defined both __new__ and __init__ methods inside the class Language and created an object of the class. When we run the code, Python calls the __new__ method, which is responsible for creating and returning the object, and then calls the __init__ method, which is responsible for initializing the object's attributes (instance variables).

Now we can access the attributes of the object lang and year using dot notation on the object language as we did in the above code.

Python calls __init__ only when __new__ returns an instance of the class. If we don't return super().__new__(cls), then __new__ returns None, the __init__ method is not executed, and the class call evaluates to None.

class Language:
    def __new__(cls, *args):
        print("Creating")

    # Method not called
    def __init__(self, lang, year):
        print("Initializing")
        self.lang = lang
        self.year = year

language = Language('Python', 1991)
print(language)
----------
Creating
None

Let's see what happens when we implement only the __init__ method inside a class.

class Language:
    def __init__(self, lang, year):
        self.lang = lang
        self.year = year

language = Language('Python', 1991)
print(language.lang)
print(language.year)
----------
Python
1991

The code works the same as the code at the beginning of this section, because a class that does not define __new__ inherits it from object.

When we instantiate the class using language = Language('Python', 1991), the expression is roughly equivalent to the following:

language = object.__new__(Language)
language.__init__('Python', 1991)

If we print the object's __dict__ once after calling the __new__ method and again after calling the __init__ method, we'll get the following output:

language = object.__new__(Language)
print(language.__dict__)
language.__init__('Python', 1991)
print(language.__dict__)
----------
{}
{'lang': 'Python', 'year': 1991}

We got an empty dictionary after calling the __new__ method because the object was created but not yet initialized. To initialize it, we called the __init__ method explicitly, after which the attributes held their values.
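The two-step process can be written as a small helper. This is a rough sketch of what happens when a class is called, not the interpreter's actual implementation; the instantiate function and the Broken class are hypothetical names.

```python
def instantiate(cls, *args, **kwargs):
    """Roughly mimic what calling cls(*args, **kwargs) does."""
    obj = cls.__new__(cls, *args, **kwargs)
    # __init__ only runs when __new__ returned an instance of cls
    if isinstance(obj, cls):
        obj.__init__(*args, **kwargs)
    return obj


class Language:
    def __init__(self, lang, year):
        self.lang = lang
        self.year = year


class Broken:
    def __new__(cls, *args):
        return None  # no instance is returned

    def __init__(self, *args):
        raise AssertionError("never reached")


language = instantiate(Language, "Python", 1991)
print(language.__dict__)           # {'lang': 'Python', 'year': 1991}
print(instantiate(Broken, 1, 2))   # None
```

The Broken class shows the earlier point again: since its __new__ returns None, the __init__ method is skipped.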

When To Use

Use case of __new__

Consider the following example in which we are using the __new__ method to customize the object at the instantiation.

class Reverse(str):
    def __new__(cls, sequence):
        return super().__new__(cls, sequence[::-1])

seq = Reverse("GeekPython")
print(seq)

The above code defines the class Reverse, which inherits from the str built-in type, as well as the __new__ method that accepts a sequence. We override the __new__ method to reverse the sequence before creating the object.

nohtyPkeeG

The argument "GeekPython" passed to the Reverse class got reversed due to sequence[::-1] before the object is created.

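This can't be done using the __init__ method because str is immutable: by the time __init__ runs, the value of the string has already been fixed by __new__. If we try to do so, the result will be an error.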

class Reverse(str):
    def __init__(self, sequence):
        super().__init__(sequence[::-1])

seq = Reverse("GeekPython")
print(seq)
----------
TypeError: object.__init__() takes exactly one argument (the instance to initialize)

Another use case of the __new__ method is creating a Singleton (design pattern that restricts the instantiation of a class to a single instance).

class Singleton:
    # Created a private variable
    __ins = None

    # Defined the __new__ method
    def __new__(cls):
        if cls.__ins is None:
            print("Instance creating...")
            cls.__ins = super().__new__(cls)
        return cls.__ins

# Creating object
obj1 = Singleton()
obj2 = Singleton()
obj3 = Singleton()

print(obj1)
print(obj2)
print(obj3)

# Checking if they are all same
print(obj1 is obj2 is obj3)

In the above code, we defined the class Singleton and created a private class variable __ins to store the class's single instance. The __new__ method checks whether __ins is None; if so, it creates a new instance and assigns it to __ins. Otherwise, it returns the existing instance.

Then we printed three instances of the Singleton class named obj1, obj2, and obj3 and checked to see if they were all the same.

Instance creating...
<__main__.Singleton object at 0x000001B3DFD5C130>
<__main__.Singleton object at 0x000001B3DFD5C130>
<__main__.Singleton object at 0x000001B3DFD5C130>
True

All three instances point to the same memory address, and we can see that we got True, indicating that they are all the same.
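One caveat of this pattern is worth a quick sketch: if the class also defines __init__, it runs on every call, even though the same object is returned. The Config class below is a hypothetical example, not part of the code above.

```python
class Config:
    _instance = None  # hypothetical class used only for this sketch

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
        return cls._instance

    def __init__(self, name="default"):
        # Runs on every Config(...) call, even though the object is reused
        self.name = name


first = Config("first")
second = Config("second")

print(first is second)  # True
print(first.name)       # second
```

The second call overwrote the name attribute, so any state set in __init__ of a singleton needs to be guarded if it must survive later calls.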

Use case of __init__

The __init__ method is commonly used to initialize the object's attributes with or without the default values.

class Language:
    def __init__(self, lang="Python", year=1991):
        self.lang = lang
        self.year = year

    def show(self):
        print(f'Language: {self.lang} | Founded: {self.year}.')

language = Language()
language.show()

The above code defines a Language class and the __init__ method, which accepts lang and year parameters with default values of "Python" and 1991, respectively.

When we call the Language class without argument, the __init__ method will set the lang and year attributes to their default values.

Language: Python | Founded: 1991.

Difference

Now that we've seen the definition, syntax, and implementation of both methods, we are now able to differentiate between them.

__new__ method	__init__ method
The `__new__` method is called first	The `__init__` method is called after the `__new__` method
Used to create and return the object	Used to initialize the attributes of the object
It is a constructor method	It is an initializer method
Takes class as the first parameter	Takes the instance of the class as the first parameter
Can be overridden to customize the object at the instantiation	Typically used only to initialize the attributes of the object

Conclusion

Python has a concept of a constructor and an initializer method. The __new__ method is a constructor method whereas the __init__ method is an initializer method. Python first calls the __new__ method which is responsible for the object creation and then calls the __init__ method which is responsible for the initialization of the object's attributes.

That's all for now

Keep Coding

Context Managers And The 'with' Statement In Python: A Comprehensive Guide With Examples

Sachin Pal — Fri, 05 May 2023 16:49:53 GMT

In this article, we'll look at context managers and how they can be used with Python's "with" statements and how to create our own custom context manager.

What Is Context Manager?

Resource management is critical in any programming language, and the use of system resources in programs is common.

Assume we are working on a project where we need to establish a database connection or perform file operations; these operations consume resources that are limited in supply, so they must be released after use; otherwise, issues such as running out of memory or file descriptors, or exceeding the maximum number of connections or network bandwidth can arise.

Context managers come to the rescue in these situations; they are used to prepare resources for use by the program and then free resources when the resources are no longer required, even if exceptions have occurred.

Why Use Context Manager?

As previously discussed, context managers provide a mechanism for the setup and teardown of the resources associated with the program. It improves the readability, conciseness, and maintainability of the code.

Consider the following example, in which we perform a file writing operation without using the with statement.

# Opening file
file = open('sample.txt', 'w')
try:
    # Writing data into file
    data = file.write("Hello")
except Exception as e:
    print(f"Error Occurred: {e}")
finally:
    # Closing the file
    file.close()

In this approach, we had to write more lines of code and close the file manually in the finally block.

Even if an exception occurs, the finally block ensures that the file is closed. However, using the open() function with the with statement reduces the excess code and eliminates the need to close the file manually.

with open("sample.txt", "w") as file:
    data = file.write("Hello")

In the preceding code, when the with statement is executed, the open() function is called and returns a file object, which is itself a context manager. The file object's __enter__ method is then called and returns the same file object. The as clause assigns that object to the variable file, and the text is written to sample.txt through it. Finally, when execution leaves the with block, the __exit__ method is invoked to close the file.

We'll learn more about __enter__ and __exit__ methods in the upcoming sections.

We can check if the file is actually closed or not.

print(file.closed)
----------
True

We received the result True, indicating that the file is automatically closed once the execution exits the with block.

Using with Statement

If you have used the with statement, you have already used a context manager. The with statement is probably most commonly used when opening a file.

# Opening a file
with open('sample.txt', 'r') as file:
    content = file.read()

Here's a simple program that opens a text file and reads the content. The expression after the with keyword, here the call to open(), is evaluated to obtain a context manager.

The context manager implements two methods called __enter__ and __exit__. The __enter__ method is called at the start to prepare the resource to be used, and the __exit__ method is called at the end to release resources.

Python runs the above code in the following order:

The with statement is executed, and the open() function is called. It opens the file and returns a file object.
The file object's __enter__ method is called and returns the file object itself. The as clause then assigns it to the file variable.
The inner block of the code content = file.read() gets executed.
In the end, the __exit__ method is called to perform the cleanup and close the file.
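The order above can be written out by hand. This is a simplified sketch: a real with statement also passes exception details to __exit__, which is skipped here. The temporary file path and the name manager are assumptions made so the example is self-contained.

```python
import os
import tempfile

# Create a throwaway file so the sketch is self-contained
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as handle:
    handle.write("Hello")

manager = open(path, "r")        # 1. open() returns a file object
file = manager.__enter__()       # 2. __enter__ returns the same file object
try:
    content = file.read()        # 3. body of the with block
finally:
    manager.__exit__(None, None, None)  # 4. cleanup closes the file

print(content)          # Hello
print(file is manager)  # True
print(file.closed)      # True
```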

Let's define and implement both these methods in a Python class and try to understand the execution flow of the program.

Creating Context Manager

The context manager will be created by implementing the __enter__ and __exit__ methods within the class. Any class that has both of these methods can act as a context manager.

Defining a Python class

# Creating a class-based context manager
class Conmanager:
    def __enter__(self):
        print("Context Manager's enter method is called.")

    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Exit method is called...")
        print(f'Exception Type: {exc_type}')
        print(f'Exception Value: {exc_val}')
        print(f'Exception Traceback: {exc_tb}')

# Using the "with" stmt
with Conmanager() as cn:
    print("Inner block of code within the 'with' statement.")

First, we created a class named Conmanager and defined the __enter__ and __exit__ methods inside the class. Then we created a Conmanager object in the with statement. Note that the as clause binds cn to the value returned by __enter__, not to the Conmanager object itself; since __enter__ returns nothing here, cn is None. We will get the following output after running the above program.

Context Manager's enter method is called.
Inner block of code within the 'with' statement.
Exit method is called...
Exception Type: None
Exception Value: None
Exception Traceback: None

When the with block is executed, Python orders the execution flow as follows:

As we can see from the output, the __enter__ method is called first.
The code contained within the with statement is executed.
To exit the with statement block, the __exit__ method is called at the end.

We can see in the output that we got None values for the exc_type, exc_val, and exc_tb parameters passed inside the __exit__ method of the class Conmanager.

When an exception occurs while executing the with statement, these parameters take effect.

exc_type - the type (class) of the exception.
exc_val - the exception instance, which carries the message.
exc_tb - the traceback object of the exception.
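The three parameters can be inspected with the standard traceback module. In this sketch, the Recorder class and its attribute names are hypothetical, and the exception is suppressed only so that the script keeps running.

```python
import traceback


class Recorder:
    """Hypothetical context manager that stores what __exit__ receives."""

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.exc_type = exc_type
        self.message = str(exc_val)
        # format_tb turns the traceback object into readable lines
        self.frames = traceback.format_tb(exc_tb) if exc_tb else []
        return True  # suppress the exception so the script keeps running


with Recorder() as rec:
    int("not a number")

print(rec.exc_type)  # <class 'ValueError'>
print(rec.message)
print("".join(rec.frames))
```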

Consider the following example, which shows how these parameters were used when an exception occurred.

# Creating a class-based context manager
class Conmanager:
    def __enter__(self):
        print("Enter method is called.")
        return "Do some stuff"

    def __exit__(self, exc_type, exc_val, exc_tb):
        print("Exit method is called...")
        print(f'Exception Type: {exc_type}')
        print(f'Exception Value: {exc_val}')
        print(f'Exception Traceback: {exc_tb}')

# Using the "with" stmt
with Conmanager() as cn:
    print(cn)
    # Raising exception on purpose
    cn.read()

When we run the above code, we get the following result.

Enter method is called.
Do some stuff
Exit method is called...
Exception Type: <class 'AttributeError'>
Exception Value: 'str' object has no attribute 'read'
Exception Traceback: <traceback object at 0x...>
Traceback (most recent call last):
  ....
    cn.read()
AttributeError: 'str' object has no attribute 'read'

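Instead of None values, the three parameters now carry the details of the AttributeError, as shown in the output above.

exc_type displayed the <class 'AttributeError'> value.
exc_val displayed the 'str' object has no attribute 'read' message.
exc_tb displayed the traceback object.

Because __exit__ did not return a true value, the exception was re-raised after __exit__ finished, which is why the full traceback appears at the end of the output.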

Example

In the following example, we've created a context manager class that will reverse a sequence.

class Reverse:
    def __init__(self, data):
        self.data = data

    def __enter__(self):
        self.operate = self.data[:: -1]
        return self.operate

    def __exit__(self, exc_type, exc_val, exc_tb):
        pass

with Reverse("Geek") as rev:
    print(f"Reversed string: {rev}")

We've created a class called Reverse and defined the __init__ method, which takes data, the __enter__ method, which operates on the data and returns the reversed version of it, and the __exit__ method, which does nothing.

Then we used the with statement with a Reverse object created from the sequence "Geek". The as clause assigned the value returned by __enter__, the reversed string, to rev, which we then printed. We will get the following output after running the above code.

Reversed string: keeG

The code above has a flaw: the __exit__ method contains no exception-handling code. What if we run into an exception?

with Reverse("Geek") as rev:
    # Modified the code from here
    print(rev.copy())

We changed the code within the with statement and attempted to print the rev.copy(). This will result in an error.

Traceback (most recent call last):
  ....
    print(rev.copy())
AttributeError: 'str' object has no attribute 'copy'

Exception Handling

Let's include the exception handling code in the __exit__ method.

class Reverse:
    def __init__(self, data):
        self.data = data

    def __enter__(self):
        self.operate = self.data[:: -1]
        return self.operate

    def __exit__(self, exc_type, exc_val, exc_tb):
        if exc_type is None:
            return self.operate
        else:
            print(f"Exception occurred: {exc_type}")
            print(f"Exception message: {exc_val}")
            return True

with Reverse("Geek") as rev:
    print(rev.copy())

print("Execution of the program continues...")

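In the __exit__ method, we check exc_type. If it is None, no exception occurred; Python only uses the return value of __exit__ to decide whether to suppress an exception, so the value returned in this branch has no effect. Otherwise, we print the exception type and message and return True.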

Exception occurred: <class 'AttributeError'>
Exception message: 'str' object has no attribute 'copy'
Execution of the program continues...

The exception was handled by the __exit__ method. Because it returned True when the error occurred, the exception was suppressed and program execution continued after the with block. We can tell because the print statement written outside the with block was executed.
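Returning True for every exception can hide real bugs, so it is often safer to suppress only the exception types that are expected. A minimal sketch of that idea follows; the SuppressAttributeError class is a hypothetical name, not part of the code above.

```python
class SuppressAttributeError:
    """Hypothetical context manager that swallows only AttributeError."""

    def __enter__(self):
        return "Geek"[::-1]

    def __exit__(self, exc_type, exc_val, exc_tb):
        # issubclass is only evaluated when an exception actually occurred
        return exc_type is not None and issubclass(exc_type, AttributeError)


with SuppressAttributeError() as rev:
    rev.copy()  # AttributeError is suppressed
print("Execution continues...")

outcome = "not raised"
try:
    with SuppressAttributeError() as rev:
        int(rev)  # ValueError is not suppressed and propagates
except ValueError:
    outcome = "ValueError propagated"
print(outcome)
```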

Conclusion

Context managers provide a way to manage resources efficiently by preparing them for use and then releasing them once they are no longer needed. They are used with Python's with statement to handle the setup and teardown of resources in the program.

We can also create our own custom context manager by implementing the enter (setup) logic and exit (teardown) logic within a Python class.
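A class is not the only option. As a brief aside, the standard library's contextlib.contextmanager decorator builds a context manager from a generator function. The reverse function below is a hypothetical rewrite of the Reverse class, shown as a sketch only.

```python
from contextlib import contextmanager


@contextmanager
def reverse(data):
    # Code before yield plays the role of __enter__
    reversed_data = data[::-1]
    try:
        yield reversed_data
    finally:
        # Code in finally plays the role of __exit__
        print("Cleaning up")


with reverse("Geek") as rev:
    print(f"Reversed string: {rev}")
```

The value given to yield is what the as clause receives, and the finally block runs whether or not the with block raised an exception.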

In this article, we've learned:

What context managers are and why they are used
Using context manager with the with statement
Implementing context management protocol within a class

That's all for now

Keep Coding

How To Display Local And Web Images In Jupyter Notebook

Sachin Pal — Mon, 01 May 2023 15:30:39 GMT

Jupyter Notebook is most commonly used for data science, machine learning, and visualization. It is an open-source web application that allows us to write and share code.

Jupyter Notebook includes cells that allow us to run our program in sections, making it more interactive and easier for developers to debug, analyze, and test their code.

We sometimes need to display images for analysis or testing purposes. In this article, we'll look at how to load and display images in Jupyter Notebook.

Using IPython

IPython is an advanced and interactive shell for Python and is used as a kernel for Jupyter. We'll use some of its classes within the Jupyter Notebook.

IPython Notebook is now known as Jupyter Notebook. The Jupyter project began as IPython and IPython Notebook.

The Image class from the IPython.display is the most commonly used method for displaying images in the Jupyter Notebook. Let us illustrate with an example.

from IPython.display import Image

Image("D:/SACHIN/Desktop/pirate.jpg", width=200, height=100)

First, we imported the Image class from IPython's display module, then called it with the absolute path to the source image and set the width and height.

Run the above code in a Jupyter Notebook cell, and the image will appear in the output.

We can also show web images. We only need to specify the image's web address and then run the cell to see the image.

Image("https://i.pinimg.com/564x/ed/72/22/ed7222b13f4ca9b4def7652a1539ef5f.jpg", width=200, height=100)

Since we have already imported the Image class, we don't need to import it again. We called the Image class and passed the image's web address along with a specific width and height.

The image will be displayed in the output after the cell has been run.

Using Matplotlib

The Matplotlib library is used to display data in a variety of charts, plots, and graphs. We can also use this library to visualise our images.

We'll see two approaches to reading an image using Matplotlib and displaying it in the Jupyter Notebook.

First Approach

# Importing required libs
import matplotlib.pyplot as plt
from PIL import Image

# Opening image
img = Image.open('D:/SACHIN/Desktop/pirate.jpg')

# Plotting the image
plotimg = plt.imshow(img)

First, we imported the required libraries which are as follows:

matplotlib.pyplot - to plot the image
PIL - for image processing

Then we passed the path to the image inside the Image.open.

The image was then plotted using the imshow function from matplotlib.pyplot, with the image passed inside. Run the cell and the image will be displayed.

This method also has a variant in which we use the cv2 library instead of PIL. The code looks like this:

# Importing libs
import cv2
import matplotlib.pyplot as plt

# Reading the image
img = cv2.imread('D:/SACHIN/Desktop/pirate.jpg')

# Converting image into RGB image
rgb_img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

# Plotting the image
plotimg = plt.imshow(rgb_img)

First, we imported the libraries, and then we read the image with cv2.imread. OpenCV loads images in BGR channel order, while Matplotlib expects RGB, so we converted the image from BGR to RGB before plotting it with Matplotlib.

Second Approach

# Importing required libs
import matplotlib.pyplot as plt
import matplotlib.image as pltimg

# Reading the image
image = pltimg.imread("D:/SACHIN/Desktop/pirate.jpg")

# Plotting the image
plt.imshow(image)

The libraries were imported as follows:

matplotlib.pyplot - to plot the images
matplotlib.image - to read the image

This time we didn't use the PIL or cv2 library. Instead, we used the image module from the matplotlib library and called its imread function, passing the path of the image to display.

The image was then plotted using the imshow function. When we run the cell, the image will appear.

Displaying Web Images

What if we wanted to use Matplotlib to display web images? We can accomplish our goal with the assistance of a few libraries. Let's take a look at the code.

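# Importing required libs
import matplotlib.pyplot as plt
from PIL import Image
# Importing the request submodule explicitly; a bare "import urllib" does not guarantee it is loaded
import urllib.request

# Opening the image from its URL
img = Image.open(urllib.request.urlopen('https://i.pinimg.com/564x/ed/72/22/ed7222b13f4ca9b4def7652a1539ef5f.jpg'))

# Plotting the image
plotimg = plt.imshow(img)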

The additional library we imported is urllib, whose request module will assist us in working with URLs.

We opened the URL of the image with urllib.request.urlopen and passed it inside Image.open. The image was then plotted using the imshow function.

Using Markdown

We don't need to write any code for this technique; we just need to add a markdown command to display the image.

![Pirate](pirate.jpg)

Before running the above markdown command, be sure to set the cell to Markdown mode.

After changing the cell mode to Markdown, run the cell and the image will be displayed.

The same process also goes for the web image.

![Pirate](https://i.pinimg.com/564x/ed/72/22/ed7222b13f4ca9b4def7652a1539ef5f.jpg)

We can also use plain HTML to set the width and height of the image to our specifications.

<img src="pirate.jpg" width=200 height=100 />

Run the cell in Markdown mode and the image will be displayed.

Conclusion

We've seen how to display local and web images using various methods in the Jupyter Notebook. We've seen the following methods:

Using IPython
Using Matplotlib (with PIL or cv2)
Using Markdown

The first method is the most common way to display images in Jupyter Notebook, but the other two methods are equally effective. Which method we use is entirely up to us.

That's all for now

Keep Coding

How To Convert Bytes To A String - Different Methods Explained

Sachin Pal — Fri, 28 Apr 2023 15:30:39 GMT

In Python, a byte string is a sequence of bytes, which are the fundamental building blocks of digital data such as images, audio and videos. Byte strings differ from regular strings in that they are made up of bytes rather than characters.

Sometimes we work on projects where we need to handle bytes and convert them into Python strings in order to perform specific operations.

In this article, we'll look at the ways we can convert a byte string into a normal string in Python.

Bytes string

In Python, a byte string can be generated by prefixing the character "b" before the string's quotation mark. The following example will demonstrate how to generate a byte string.

byte_str = b"GeekPython"

We created a byte string containing the characters "G", "e", "e", "k", "P", "y", "t", "h", "o" and "n".

The byte string above was straightforward and easy to generate, but the bytes of an image look quite different: most of them do not correspond to printable characters.

Together, these bytes make up the image, and their values vary with the type of data. Next, we'll see the methods to convert a byte string into a normal string.
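To make the difference between bytes and characters concrete, here is a short sketch that inspects a bytes object. The values are illustrative.

```python
byte_str = b"GeekPython"

# Indexing a bytes object gives an integer between 0 and 255
print(byte_str[0])         # 71
print(list(byte_str[:4]))  # [71, 101, 101, 107]

# Slicing keeps the bytes type
print(byte_str[:4])        # b'Geek'

# Non-ASCII text takes more than one byte per character in UTF-8
encoded = "é".encode("utf-8")
print(encoded, len(encoded))  # b'\xc3\xa9' 2
```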

Method 1 - decode method

The decode method is the most commonly used method by developers. The decode method converts a byte string into a normal string using the specified encoding. Let us illustrate with an example.

# Byte string
byte_str = b"GeekPython"

# Converting
nor_str = byte_str.decode(encoding='utf-8')
print(nor_str)

# Checking the type of string
print(f'Type: {type(nor_str)}')
----------
GeekPython
Type: <class 'str'>

We used the decode method on the variable byte_str, which contains a byte string, and set the encoding to utf-8. The output shows that our byte string was converted into a normal string.

Here's an example of converting the image's byte to a string. We first saved the image's bytes in a file before converting them to a normal string.

with open('binary_file', 'rb') as file:
    chars = file.read()
    print(f'Content type in file before: {type(chars)}')
    # print(chars)
    decoded = chars.decode('utf-8', errors='ignore')
    # print(decoded)
    print(f'Content type in file after: {type(decoded)}')
----------
Content type in file before: <class 'bytes'>
Content type in file after: <class 'str'>

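Note: Image bytes are generally not valid UTF-8, so decoding them as utf-8 is rarely useful. With errors='ignore', undecodable bytes are silently dropped, and the resulting string is garbled text (mojibake) that cannot be turned back into the original image.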
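The effect of the errors argument can be seen on a few bytes. In this sketch, the sample consists of the PNG file signature plus one arbitrary invalid byte; Base64 is shown as one common way to carry binary data as text, which is an addition not covered by the example above.

```python
import base64

# PNG file signature followed by an arbitrary invalid byte
raw = b"\x89PNG\r\n\x1a\n\xff"

try:
    raw.decode("utf-8")  # the default errors="strict" raises
except UnicodeDecodeError as exc:
    print(f"strict: {exc.reason}")

ignored = raw.decode("utf-8", errors="ignore")    # drops undecodable bytes
replaced = raw.decode("utf-8", errors="replace")  # inserts U+FFFD markers
print(repr(ignored))
print(repr(replaced))

# Base64 gives a text form that can be turned back into the exact bytes
as_text = base64.b64encode(raw).decode("ascii")
print(base64.b64decode(as_text) == raw)  # True
```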

Method 2 - codecs module

It's the same method as before, but this time we'll use the decode method from Python's codecs module.

import codecs

bstr = b'\xa3'
emoji = b'\xF0\x9F\x98\x86\xF0\x9F\x98\x81\xF0\x9F\x98\x82'

char_dec = codecs.decode(bstr, encoding='cp1252')
print(char_dec)
print(f'Type(Before decoding): {type(bstr)}')
print(f'Type(After decoding): {type(char_dec)}')
print('-'*20)

dec = codecs.decode(emoji, encoding='utf-8')
print(dec)
print(f'Type(Before decoding): {type(emoji)}')
print(f'Type(After decoding): {type(dec)}')

In the first block of code, we decoded the bytes stored in the variable bstr and specified the cp1252 encoding (used for decoding single-byte Latin alphabet characters).

In the second block of code, we decoded the emoji bytes using the utf-8 encoding.

£
Type(Before decoding): <class 'bytes'>
Type(After decoding): <class 'str'>
--------------------
😆😁😂
Type(Before decoding): <class 'bytes'>
Type(After decoding): <class 'str'>
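The codecs module is also useful when bytes arrive in pieces, for example from a network stream, because a multi-byte character may be split across two chunks. A minimal sketch using an incremental decoder follows; the way the data is cut into chunks is an assumption made for illustration.

```python
import codecs

emoji = b'\xF0\x9F\x98\x86\xF0\x9F\x98\x81'

# Split in the middle of the 4-byte characters
chunks = [emoji[:2], emoji[2:6], emoji[6:]]

decoder = codecs.getincrementaldecoder('utf-8')()
parts = [decoder.decode(chunk) for chunk in chunks]
parts.append(decoder.decode(b'', final=True))  # flush; raises if bytes are left over

print(parts)
print(''.join(parts))
```

Calling decode() on each chunk separately would fail, whereas the incremental decoder buffers the incomplete bytes until the rest of the character arrives.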

Method 3 - str method

In this approach, we'll use the most basic technique, which is the str method. The str method converts data to a string, which we'll use to convert the byte string to a regular string.

byte_str = b'GeekPython'
print(type(byte_str))
print('-'*20)
# Using str method with encoding
normal_str = str(byte_str, 'utf-8')
print(normal_str)
print(type(normal_str))
print('-'*20)
# Using str method without encoding
without_encoding = str(byte_str)
print(without_encoding)
print(type(without_encoding))

In the first block of code, we used the str method and passed a byte string with the utf-8 encoding. In the second block of code, we did the same thing as in the first, but we didn't specify the encoding.

<class 'bytes'>
--------------------
GeekPython
<class 'str'>
--------------------
b'GeekPython'
<class 'str'>

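Both results are strings, but they are not the same. With an encoding, str() decodes the bytes and returns GeekPython. Without an encoding, str() does not decode anything: it returns the printable representation of the bytes object, including the b prefix and the quotes.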
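The difference between the two calls can be checked directly. This is a small sketch; the variable names as_repr and decoded are illustrative only.

```python
byte_str = b'GeekPython'

# Without an encoding, str() returns the same text as repr()
as_repr = str(byte_str)

# With an encoding, str() actually decodes the bytes
decoded = str(byte_str, 'utf-8')

print(as_repr == repr(byte_str))   # True
print(len(as_repr), len(decoded))  # 13 10
print(as_repr == decoded)          # False
```

The three extra characters in as_repr are the b prefix and the two quotes, so str(byte_str) is not a substitute for decoding.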

Comparing execution time

We can compare the execution time of these three methods to see which one is the fastest.

import timeit
print("Execution time of decode method:")
print(timeit.timeit(stmt='byte_str=b"GeekPython";n=byte_str.decode("utf-8")'))
print('-'*20)
print("Execution time of codecs.decode method:")
print(timeit.timeit(setup="import codecs", stmt='byte_str=b"GeekPython";n=codecs.decode(byte_str, "utf-8")'))
print('-'*20)
print("Execution time of str method:")
print(timeit.timeit(stmt='byte_str=b"GeekPython";n=str(byte_str, "utf-8")'))

We measured the execution time of the code snippets using the timeit module.

Execution time of decode method:
0.14236710011027753
--------------------
Execution time of codecs.decode method:
0.7000259000342339
--------------------
Execution time of str method:
0.177455399883911

In this run, the decode method took the least time, closely followed by the str method, while codecs.decode was several times slower. The absolute numbers will vary with the machine and the Python version.

Conclusion

In this article, we've learned the different methods to convert the byte string into the regular string. We've seen three methods which are as follows:

using the decode method
using the codecs.decode method
using the str method

These three methods can be used to convert a byte string to a regular string, but the first choice for the developers can be the decode method because it is simpler and consumes less time than the other two methods.

That's all for now

Keep Coding

Difference Between insert(), append() And extend() In Python With Examples

Sachin Pal — Thu, 20 Apr 2023 14:30:39 GMT

List is one of Python's built-in data structures for storing collections of modifiable and ordered data. Lists can hold a variety of data types, including strings, integers, and even lists.

Lists are mutable, which means they can be modified after they are created. Python provides some methods for modifying the data within the list.

This article will explain the distinctions between the list insert(), append(), and extend() methods. We'll see how they differ and how they're used to modify the list.

Methods Objective

Every Python developer would have worked with lists and used these methods periodically. List insert(), append(), and extend() are somewhat related because they are used to add elements to a list.

However, these methods are syntactically and programmatically distinct: each method has its own way of adding data to the list.

List insert()

To insert means to put an element into the data at a particular place, and this method does exactly what its name suggests.

Python list insert() method helps to insert an element to the list at the desired position. This method takes two arguments, the first argument is the index at which the element is to be inserted and the second argument is the element that will be inserted.

list.insert(index, element)

# List of names
friends = ['Sachin', 'Rishu', 'Yashwant']
# Printing the original list
print('Original List:', friends)
# Inserting an element
friends.insert(1, 'Abhishek')
# Printing the updated list
print('New List: ', friends)
----------
Original List: ['Sachin', 'Rishu', 'Yashwant']
New List:  ['Sachin', 'Abhishek', 'Rishu', 'Yashwant']

Using the insert() method, we inserted the string 'Abhishek' at index 1 in the list friends.
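The index argument also accepts values at the edges of the list. The sketch below reuses the names from the example and shows three such cases; the choice of indexes is illustrative only.

```python
friends = ['Sachin', 'Rishu', 'Yashwant']

# Index 0 puts the element at the front
friends.insert(0, 'Abhishek')

# A negative index counts from the end: -1 means "before the last item"
friends.insert(-1, 'Yogesh')

# An index past the end does not raise; the element is simply appended
friends.insert(100, 'Aman')

print(friends)
# ['Abhishek', 'Sachin', 'Rishu', 'Yogesh', 'Yashwant', 'Aman']
```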

List append()

Python list append() method adds the element to the end of the list. It takes only one argument which is the element to be appended to the list.

list.append(element)

# List of names
friends = ['Sachin', 'Rishu', 'Yashwant']
# Printing the original list
print('Original List:', friends)
# Appending an element
friends.append(['Abhishek', 'Yogesh'])
# Printing the updated list
print('New List: ', friends)
----------
Original List: ['Sachin', 'Rishu', 'Yashwant']
New List:  ['Sachin', 'Rishu', 'Yashwant', ['Abhishek', 'Yogesh']]

List extend()

We've seen how we can use append() to add an element to the end of a list. List extend() does the same thing, but instead of adding a single element, it adds each item from the iterable, resulting in the list being extended.

list.extend(iterable) or list.extend(['elem1', 'elem2', 'elem3'])

# List of names
friends = ['Sachin', 'Rishu', 'Yashwant']
# Printing the original list
print('Original List:', friends)
# Using extend() to extend the list
friends.extend(['Abhishek', 'Yogesh'])
# Printing the updated list
print('New List: ', friends)
----------
Original List: ['Sachin', 'Rishu', 'Yashwant']
New List:  ['Sachin', 'Rishu', 'Yashwant', 'Abhishek', 'Yogesh']

We passed the list in the expression friends.extend(['Abhishek', 'Yogesh']) and as we can see the items became part of the original list.
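Because extend() accepts any iterable, it also works with tuples and strings. The following sketch contrasts it with append(); the values are made up for illustration.

```python
friends = ['Sachin']

# A tuple is an iterable, so each item is added individually
friends.extend(('Rishu', 'Yashwant'))

# A string is an iterable of characters, so extend() adds one item per character
friends.extend('AB')

# append() adds the same string as a single element
friends.append('AB')

print(friends)
# ['Sachin', 'Rishu', 'Yashwant', 'A', 'B', 'AB']
```

Passing a string to extend() is a common slip; use append() when the whole string should become one element.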

Difference

insert()	append()	extend()
Used to add the element at the desired position in the list	Used to add the element at the end of the list	Used to add the elements from the iterable at the end of the list
`list.insert(index, element)`	`list.append(element)`	`list.extend(iterable)`
Takes two parameters	Takes a single parameter	Takes a single iterable parameter
Can take iterable but adds it as it is	Can take iterable but adds it as it is	Takes iterable and each item is added individually
List length increases by 1	List length increases by 1	Length of the list increases by the number of items in the iterable
Time complexity is linear i.e., O(n), because the elements after the index must be shifted	Amortized time complexity is constant i.e., O(1)	Has time complexity of O(x), where x is the length of the iterable

Time complexity describes how the running time of an operation grows with the size of its input.

Conclusion

We've seen how the list insert(), append(), and extend() methods differ from one another in this article. The goal of these methods is the same in that they add elements to the list, but they differ in how they add the elements to the list.

We've compared these methods using the code examples and here's a recap of what we've learned

insert() - This method is used to insert the element at the desired position in the list
append() - This method is used to add the element to the end of the list.
extend() - This method is used to add each item from the iterable to the end of the list.

That's all for now

Keep Coding

Accessing List Values Within The Dictionary In Python

Sachin Pal — Tue, 18 Apr 2023 14:31:38 GMT

The dictionary is a data structure in Python that belongs to the mapping category. When data(key-value) is enclosed by curly({ }) braces, we can say it is a dictionary.

A dictionary has a key that holds a value, also known as a key-value pair. Using the dictionary[key], we can get the value assigned to the key.

What if the dictionary contains the values in the form of a list? In this article, we'll look at all of the different ways to access items from lists within the dictionary.

Sample data

Given data contains items in a list as the values of keys within the dictionary. We'll work with this given data throughout the article.
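The dictionary below is reconstructed from the outputs printed in the following sections, so treat the exact literal as an assumption rather than the original listing.

```python
# Sample data, reconstructed from the outputs shown later in the article
my_data = {
    'Python dev': ['Sachin', 'Aman', 'Siya'],
    'C++ dev': ['Rishu', 'Yashwant', 'Rahul'],
    'PHP dev': ['Abhishek', 'Peter', 'Shalini'],
    # Nested dictionary whose values are lists as well
    'detail': {
        'name': ['Sachin', 'Siya'],
        'hobby': ['Coding', 'Reading'],
    },
}

print(my_data)
```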

Using Indexing

In the following example, we've used the indexing method and passed the index along with the key from which the value is to be extracted.

The convention dict_name[key][index] must be used to access the list values from the dictionary.

# Using indexing
# Accessing list items from the key
acc_key = my_data['Python dev']
print(acc_key)
# Accessing first list item from the key
acc_data = my_data['C++ dev'][0]
print(acc_data)
# Accessing third list item from the key
acc_data1 = my_data['PHP dev'][2]
print(acc_data1)
# Accessing second list item from the key 'name' from
# the dictionary 'detail' within the dictionary 'my_data'
acc_dict_item = my_data['detail']['name'][1]
print(acc_dict_item)

Output

['Sachin', 'Aman', 'Siya']
Rishu
Shalini
Siya

When we accessed only the key, we got the entire list, but when we specified the index number with the key, we got the specific values.

Using slicing

It's the same as the previous method, except instead of specifying the index value, we've used the slicing range.

# Using slicing method
# Accessing list items by skipping 2 steps
val = my_data['Python dev'][:3:2]
print(val)
# Accessing first list item from the end
val1 = my_data['C++ dev'][-1:]
print(val1)
# Accessing list items except the last item
val2 = my_data['PHP dev'][:-1]
print(val2)

Output

['Sachin', 'Siya']
['Rahul']
['Abhishek', 'Peter']

Using for loop

The for loop has been used in the following example to iterate over the values of the list in the dictionary my_data.

# Using for loop
# 1 - Accessing key and values from the dict 'my_data'
for key, value in my_data.items():
    print(key, value)
print('-'*20)
# 2 - Iterating over list items from each key
for key, value in my_data.items():
    for items in value:
        print(f'{key} - {items}')
print('-'*20)
# 3 - Iterating list items from nested dictionary 'detail'
for key, value in my_data['detail'].items():
    for items in value:
        print(f'{key} - {items}')
print('-'*20)
# 4 - Accessing list item from individual key
for value in my_data['PHP dev']:
    print(value)

We iterated both keys and values from the dictionary my_data in the first block of code, then the list items from each key in the second block, the list items from each key within the nested dictionary detail in the third block, and the list items of the specific key in the last block of code.

Output

Python dev ['Sachin', 'Aman', 'Siya']
C++ dev ['Rishu', 'Yashwant', 'Rahul']
PHP dev ['Abhishek', 'Peter', 'Shalini']
detail {'name': ['Sachin', 'Siya'], 'hobby': ['Coding', 'Reading']}
--------------------
Python dev - Sachin
Python dev - Aman
Python dev - Siya
C++ dev - Rishu
C++ dev - Yashwant
C++ dev - Rahul
PHP dev - Abhishek
PHP dev - Peter
PHP dev - Shalini
detail - name
detail - hobby
--------------------
name - Sachin
name - Siya
hobby - Coding
hobby - Reading
--------------------
Abhishek
Peter
Shalini

Using list comprehension

The list comprehension technique is used in the following example. It's the same method as before, but this time we've used the for loop within the list.

# Using list comprehension
# Accessing list item from the key 'Python dev'
lst_com = [item for item in my_data['Python dev']]
print(lst_com)
# Accessing list item from nested dictionary
lst_com1 = [item for item in my_data['detail']['name']]
print(lst_com1)

Output

['Sachin', 'Aman', 'Siya']
['Sachin', 'Siya']

Using unpacking(*) operator

You've probably used the asterisk(*) operator for multiplication, exponentiation, **kwargs, *args, and other things, but we're going to use it as an unpacking operator.

# Using unpacking operator
# Accessing values from every key
using_unpkg_op = [*my_data.values()]
print(using_unpkg_op)
print('-'*20)
print(using_unpkg_op[0])
print('-'*20)
# Accessing list items from the key 'Python dev'
using_unpkg_op = [*my_data.get('Python dev')]
# Accessing first item from the list
print(using_unpkg_op[0])

The expression [*my_data.values()] in the preceding code unpacks all the values of the dictionary my_data into a new list, and once we have that list, we can specify the index number to access specific data.

The expression [*my_data.get('Python dev')] unpacks the list stored under the key Python dev into a new list, and through indexing, we can access the list item.
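A short sketch of what the unpacking operator does here. The dictionary is a reduced copy of the sample data, and the names unpacked, first, and rest are illustrative only.

```python
# Reduced copy of the sample data, kept small for illustration
my_data = {
    'Python dev': ['Sachin', 'Aman', 'Siya'],
    'C++ dev': ['Rishu', 'Yashwant', 'Rahul'],
}

# Unpacking inside a list display builds a new list of the values
unpacked = [*my_data.values()]
print(unpacked == list(my_data.values()))  # True

# The same operator can split a list into its first item and the rest
first, *rest = my_data['Python dev']
print(first)  # Sachin
print(rest)   # ['Aman', 'Siya']
```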

Output

[['Sachin', 'Aman', 'Siya'], ['Rishu', 'Yashwant', 'Rahul'], ['Abhishek', 'Peter', 'Shalini'], {'name': ['Sachin', 'Siya'], 'hobby': ['Coding', 'Reading']}]
--------------------
['Sachin', 'Aman', 'Siya']
--------------------
Sachin

Conclusion

In this article, we've seen the different methods to access list items within the dictionary. The dictionary we've operated on contains the keys with the values as a list and also a nested dictionary.

We've used five methods to access the list items from the dictionary which are as follows:

Indexing - Using the bracket notation([ ])
Slicing - Using the list slicing
Iterating - Using the for loop
List comprehension technique
Unpacking(*) operator

Well, this is it for the article, try to find some more methods to access list items within the dictionary.

That's all for now

Keep Coding

How To Solve CAPTCHA In A Few Steps In Python Using 2captcha

Sachin Pal — Sun, 16 Apr 2023 05:30:39 GMT

You have probably solved countless captchas for verification, which can be a tedious chore when you need to prove that you are human. Websites add this extra layer of security by using captcha services to prevent bots and automation scripts from accessing them.

What is CAPTCHA?

CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) is a form of challenge-response test used to assess whether or not a user is human.

CAPTCHAs display jumbled, scrambled, or distorted text, images, numbers, and audio that is tough for computers to solve but relatively easy for people to solve. These are used to protect websites and applications against harmful actions.

2captcha API

There are numerous applications and services available to assist us in solving a captcha in seconds. 2captcha is a website that offers captcha-solving services.

2captcha provides API for various programming languages, including Python, and we'll utilise the 2captcha-python library to solve the captcha in a few steps.

Installing dependencies

The primary work is to install the Python library called 2captcha-python. Run the following command in your terminal.

pip install 2captcha-python

Please keep in mind that we are using pip to install the library. If you are not using pip, the installation command will be different.

Solving reCAPTCHA

We'll use a reCAPTCHA testing website and here's the URL of the website https://patrickhlauke.github.io/recaptcha/. It provides a reCAPTCHA widget for testing purposes.

Code

First, we must import the library and create an instance of the class TwoCaptcha, which we will initialise with the API key. To make a POST request to the target site, we also need the requests library.

# Imported the required libs
from twocaptcha import TwoCaptcha
import requests
# Instance of TwoCaptcha
captcha_solver = TwoCaptcha('YOUR_API_KEY')

After we've built the instance, the next step is to obtain the token from the 2captcha server.

# Target website URL
site_url = "https://patrickhlauke.github.io/recaptcha/"
# Getting the token
try:
    # For solving reCAPTCHA
    token = captcha_solver.recaptcha(
        sitekey='6Ld2sf4SAAAAAKSgzs0Q13IZhY02Pyo31S2jgOB5',
        url=site_url
    )
# Handling the exceptions
except Exception as e:
    raise SystemExit('Error: CAPTCHA token not received.')
# Print the token and exit the program
else:
    raise SystemExit('Token- ' + str(token))

We called the recaptcha method from the instance captcha_solver within the try block and passed the reCAPTCHA data-sitekey to the sitekey parameter and the URL of the target site to the url parameter.

To get the sitekey from the website, open it in inspect mode and look for the tag with title="reCAPTCHA" and you'll find the key within the src attribute.

If any exception is raised within the try block, the program will display an error message and exit. If the call succeeds, the program will print the token that will be used to solve the reCAPTCHA and then quit.

The final step will be to transfer the token to the target site, which will be accomplished by sending a POST request to the target site via the requests library.

# Imported the required libs
from twocaptcha import TwoCaptcha
import requests
# Instance of TwoCaptcha
captcha_solver = TwoCaptcha('YOUR_API_KEY')
# Target website URL
site_url = "https://patrickhlauke.github.io/recaptcha/"
# Getting the token
try:
    # For solving reCAPTCHA
    token = captcha_solver.recaptcha(
        sitekey='6Ld2sf4SAAAAAKSgzs0Q13IZhY02Pyo31S2jgOB5',
        url=site_url
    )
# Handling the exceptions
except Exception as e:
    raise SystemExit('Error: CAPTCHA token not received.')
# Sending the token to the website
else:
    # Sending spoof user-agent
    headers = {'user-agent': 'Mozilla/5.0 Chrome/52.0.2743.116 Safari/537.36'}
    # Sending received token
    data = {'recaptcha-token': str(token)}
    # Making POST request to the target site
    token_response = requests.post(site_url, headers=headers, data=data)
    print('Token sent')

We changed the else part of the code, and the result is the code seen above. Before performing the POST request to the target site, we've established certain configurations to send along with the request within the else block.

We used the requests library's post function to send a POST request to the target site, passing the site_url (target site URL), headers (containing the user-agent), and data (containing token). The data can contain more parameters depending on the form.

Note: The received token from the 2captcha servers will only be valid for 120 seconds, so you need to submit the token to the target site within the time limit.

Solving hCAPTCHA

The URL of the hCAPTCHA demo site is https://accounts.hcaptcha.com/demo.

The procedure will be the same as with reCAPTCHA, with the exception that we must modify the method to hcaptcha from the 2captcha-python package.

# Imported the required libs
from twocaptcha import TwoCaptcha
import requests
# Instance of TwoCaptcha
captcha_solver = TwoCaptcha('YOUR_API_KEY')
# Target website URL
site_url = "https://accounts.hcaptcha.com/demo"
# Getting the token
try:
    # For solving hCAPTCHA
    token = captcha_solver.hcaptcha(
        sitekey='a5f74b19-9e45-40e0-b45d-47ff91b7a6c2',
        url=site_url
    )
# Handling the exceptions
except Exception as e:
    raise SystemExit('Error: CAPTCHA token not received.')
# Sending the token to the website
else:
    # Sending spoof user-agent
    headers = {'user-agent': 'Mozilla/5.0 Chrome/52.0.2743.116 Safari/537.36'}
    # Sending received token
    data = {'hcaptcha-token': str(token)}
    # Making POST request to the target site
    token_response = requests.post(site_url, headers=headers, data=data)
    print('Token sent')

We'll get the sitekey of the hCAPTCHA within the src attribute of the tag.

Conclusion

We've used the 2captcha-python package for solving the reCAPTCHA and hCAPTCHA. The 2captcha-python package provided by 2captcha provides captcha-solving services and almost all types of captcha can be solved using their APIs. They provide a browser extension for solving CAPTCHAs.

str & repr: Change String Representation In Python

Sachin Pal — Thu, 13 Apr 2023 05:30:39 GMT

In the program output, we can represent Python strings in two ways. Python supports both informal and formal string representations. When we run the Python program to print the string, we get an informal representation of it in the output.

The __str__ method in Python is responsible for the informal representation of the object, which we can change to the formal representation by using the __repr__ method.

In this article, we'll discuss these dunder methods named __str__ and __repr__ and how they are used for changing the representation of the string.

__str__() and __repr__() method

The __str__() method returns the object's easily-readable or informal string representation. Python calls the __str__() method internally when we call functions like str() and print().

The __repr__() method, unlike the __str__() method, returns a more informative or formal string representation of the object. When the __str__() method is not defined for a class, str() and print() fall back to the __repr__() method.

Overall, the __str__() method is meant for users, whereas the __repr__() method is meant for developers and can be more useful when debugging.

Default implementation

We can implement these methods within the Python class to see them in action. Let's look at the following example.

# Python class is created
class Product:
    def __init__(self, name, category):
        self.name = name
        self.category = category
# Instantiated the class
data = Product('Ford', 'Car')
# Called the __str__() and __repr__() methods
print(data.__str__())
print(data.__repr__())
print(data)

We created the Product class and defined the __init__ function that takes name and category. Then we instantiated the class Product with the necessary arguments.

Then we called the __str__() and __repr__() methods on the object of the class Product.

<__main__.Product object at 0x000002157F10C370>
<__main__.Product object at 0x000002157F10C370>
<__main__.Product object at 0x000002157F10C370>

When we ran the above code, we got the default representation: the class name followed by the object's memory address as a hexadecimal number.

This happened because we didn't implement the __str__() and __repr__() methods within our class Product. Thus, calling the __str__() method calls the default __repr__() method and shows the same output.

Custom __str__() method

In the following code, we implemented the custom __str__() method that returns a string with some information and then we executed the code to see what would be the output.

# Python class is created
class Product:
    def __init__(self, name, category):
        self.name = name
        self.category = category
    # Creating __str__() function
    def __str__(self):
        return f'The {self.name} belongs to category {self.category}.'
# Instantiated the class
data = Product('Ford', 'Car')
# Called the __str__() and __repr__() methods
print(data.__str__())
print(data.__repr__())
print(data)

Output

The Ford belongs to category Car.
<__main__.Product object at 0x00000156D7E5C370>
The Ford belongs to category Car.

We got the string defined in the class's __str__() method when we called the __str__() and print() methods on the object data, but not when we called the __repr__() method on the object data.

Only __repr__() method

What if we only implement the __repr__() method within the class Product?

We've added the __repr__() method to the Product class, which returns a string containing the product name and category.

# Python class is created
class Product:
    def __init__(self, name, category):
        self.name = name
        self.category = category
    # Creating __repr__() function
    def __repr__(self):
        return f'Product: {self.name} - Category: {self.category}.'
# Instantiated the class
data = Product('Ford', 'Car')
# Called the __str__() method
print(data.__str__())
# Called the __repr__() method
print(data.__repr__())
print(data)
# Called the str() function
print(str(data))

Output

Product: Ford - Category: Car.
Product: Ford - Category: Car.
Product: Ford - Category: Car.
Product: Ford - Category: Car.

As previously discussed, if __str__() is not defined for the class, the object will invoke the class's __repr__() method. That's why we got the string even though we called __str__(), print(), and str() on the object data.
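When both methods are defined, each one is used in a different context. The sketch below is one possible way to write the class; the exact format of the returned strings is an illustrative choice, not the only option.

```python
class Product:
    def __init__(self, name, category):
        self.name = name
        self.category = category

    # Informal representation, used by print(), str() and f-strings
    def __str__(self):
        return f'The {self.name} belongs to category {self.category}.'

    # Formal representation, used by repr() and inside containers
    def __repr__(self):
        return f'Product({self.name!r}, {self.category!r})'


data = Product('Ford', 'Car')

print(str(data))    # The Ford belongs to category Car.
print(repr(data))   # Product('Ford', 'Car')
print([data])       # [Product('Ford', 'Car')]
print(f'{data!r}')  # Product('Ford', 'Car')
```

Note that printing a list shows the __repr__() of each item, even though print() uses __str__() on the list itself.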

Calling str() and repr() on built-in class

So far, we've used user-defined classes to implement the __str__() and __repr__() methods. Now we'll look at how these methods are implemented in Python's built-in classes.

We'll see what happens when we call the __str__() and __repr__() methods on the Python built-in datetime module's classes.

import datetime
today = datetime.datetime.today()
print(f'Normal: {today}')
print('-'*20)
print(f'__str__ method: {today.__str__()}')
print(f'str() method: {str(today)}')
print('-'*20)
print(f'__repr__ method: {today.__repr__()}')
print(f'repr() method: {repr(today)}')

We used the __str__(), str(), __repr__(), and repr() methods on the variable today to print the current date and time.

Output

Normal: 2023-04-12 18:16:27.991308
--------------------
__str__ method: 2023-04-12 18:16:27.991308
str() method: 2023-04-12 18:16:27.991308
--------------------
__repr__ method: datetime.datetime(2023, 4, 12, 18, 16, 27, 991308)
repr() method: datetime.datetime(2023, 4, 12, 18, 16, 27, 991308)

The output shows that the __str__() and str() methods returned an informal or easily readable string representation, whereas the __repr__() and repr() methods returned a more informative or formal string representation.
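One reason the formal representation is useful to developers is that, for datetime objects, it is valid Python code that rebuilds the object. A minimal sketch, using a fixed timestamp taken from the output above instead of the current time:

```python
import datetime

# Fixed value so the result does not depend on when the code runs
moment = datetime.datetime(2023, 4, 12, 18, 16, 27, 991308)

print(str(moment))   # 2023-04-12 18:16:27.991308
print(repr(moment))  # datetime.datetime(2023, 4, 12, 18, 16, 27, 991308)

# Evaluating the repr string recreates an equal object
rebuilt = eval(repr(moment))
print(rebuilt == moment)  # True
```

Only call eval() on strings you produced yourself; it is used here purely to show that the repr is a complete description of the object.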

The __str__() method is implemented by default in pre-defined or built-in Python classes, so we don't need to define them explicitly.

Conclusion

The methods __str__() and __repr__() are used to represent objects in string format. The __str__() method returns a human-readable or informal string representation of the object, whereas the __repr__() method returns a more informative or formal string representation of the object.

The __str__() method is for users, whereas the __repr__() method is for developers because it is more useful when debugging.

For pre-defined or built-in Python classes, we don't need to define the __str__() method explicitly because it is implemented by default, whereas for user-defined Python classes, we must define a custom __str__() method to get the string representation of the object; otherwise, it will return the object's memory address.

The address of the object is returned by the default implementation of the __str__() and __repr__() functions on the object of the user-defined Python class. To return the string representation of the object, the custom __str__() and __repr__() methods must be defined within the class, and if __str__() is not defined, the object will call the __repr__() method.

That's all for now

Keep Coding

Difference Between sort() And sorted() In Python

Sachin Pal — Thu, 06 Apr 2023 16:21:32 GMT

We'll compare Python's list sort() method and the built-in sorted() function in this article. We'll learn what they are and how they differ programmatically and syntactically.

Python sort() and sorted() are used to sort the data in ascending or descending order. Their goals are the same but are used in different conditions.

sort()

The sort() function is a method of the Python list. It sorts the list in place and, by default, arranges the list's contents in ascending order.

# List of names
data = ['Sachin', 'Yogesh', 'Yashwant', 'Rishu']
print(f'Original Data: {data}')
# Using list.sort() function
data.sort()
# Printing sorted data
print(f'Sorted Data: {data}')
----------
Original Data: ['Sachin', 'Yogesh', 'Yashwant', 'Rishu']
Sorted Data: ['Rishu', 'Sachin', 'Yashwant', 'Yogesh']

The code above sorts the names in ascending order within the list data. That was the most fundamental use of the list sort() function. We'll see more examples where we'll manipulate the function's parameters.

Syntax

list.sort(reverse=False, key=None)

Here

reverse - Defaults to False. If reverse=True, the data will be sorted in descending order.

key - Defaults to None. We can specify a user-defined function to customize the sorting.
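Both parameters can be combined in a single call. The sketch below uses str.lower as the key so that the comparison ignores case; the mixed-case names are made up for illustration.

```python
names = ['Sachin', 'yogesh', 'Yashwant', 'rishu']

# key=str.lower compares the lowercase form of each name,
# reverse=True puts the largest value first
result = names.sort(key=str.lower, reverse=True)

print(names)   # ['yogesh', 'Yashwant', 'Sachin', 'rishu']
print(result)  # None
```

sort() changes the list in place and returns None, so assigning its result to a variable is a common mistake.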

Examples

We'll play around with these parameters and try with data having different data types to get a better understanding of this function.

Example 1 - List having different characters

# Original List
typ_data = ['$', '45', '3j+5', 'Hello']
print(f'Original Data: {typ_data}')
# Sorting the list
typ_data.sort()
# Printing sorted list
print(f'Sorted Data: {typ_data}')
----------
Original Data: ['$', '45', '3j+5', 'Hello']
Sorted Data: ['$', '3j+5', '45', 'Hello']

Example 2 - List containing the different data types

In the following example, data is a list that contains tuples and we sorted the data in descending order.

# List containing tuple data
data = [
    ('Hey', 'There'),
    ('Geeks', 'Welcome'),
    ('To', 'GeekPython')
]
# Printing the type of list item
print(f'Type: {type(data[0])}')
# Sorting the data in descending order
data.sort(reverse=True)
# Printing modified data
print(f'Sorted Data: {data}')
----------
Type: <class 'tuple'>
Sorted Data: [('To', 'GeekPython'), ('Hey', 'There'), ('Geeks', 'Welcome')]

What if we specify a similar name for the first values of the tuple?

# List containing tuple data with first values having similar name
data = [
    ('item', 2),
    ('item', 1),
    ('item', 3)
]
# Printing the type of list item
print(f'Type: {type(data[0])}')
# Sorting the data in descending order
data.sort(reverse=True)
# Printing modified data
print(f'Sorted Data: {data}')
----------
Type: <class 'tuple'>
Sorted Data: [('item', 3), ('item', 2), ('item', 1)]

The tuples were sorted based on the second value. Python compares tuples element by element and moves on to the next position only when the earlier positions are equal. Since the first values are all 'item', the second values decided the order, which is why the result is 3, 2, 1 in descending order.
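A short sketch of how tuple comparison drives this result. The key function in the last step is an illustrative addition that restricts the comparison to the first value only.

```python
# Equal first items: the second items decide
print(('item', 2) < ('item', 3))  # True

# Different first items: the second items are never looked at
print(('a', 99) < ('b', 1))       # True

data = [('item', 2), ('item', 1), ('item', 3)]

# Sorting by the first value only: all keys are equal, and because
# sort() is stable, the original order is kept even with reverse=True
data.sort(key=lambda pair: pair[0], reverse=True)
print(data)  # [('item', 2), ('item', 1), ('item', 3)]
```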

Example 3 - Using a user-defined function

# List containing dictionary data
data = [
    {'fruit': 'strawberry', 'price': 100},
    {'fruit': 'banana', 'price': 91},
    {'fruit': 'mango', 'price': 132},
    {'fruit': 'cherry', 'price': 82},
]
print(f'Original Data: {data}')
# Function for sorting by key 'price'
def sort_dict_by_price(item):
    return item['price']
# Sorting data using the user-defined sorting function
data.sort(key=sort_dict_by_price)
print('-'*20)
# Printing the data
print(f'Sorted Data: {data}')

We've written a function called sort_dict_by_price that takes a parameter item, which is one dictionary from the list, and returns the value of its key 'price'.

This function was passed to the key parameter, which will sort the data based on the price in ascending order.

Output

Original Data: [{'fruit': 'strawberry', 'price': 100}, {'fruit': 'banana', 'price': 91}, {'fruit': 'mango', 'price': 132}, {'fruit': 'cherry', 'price': 82}]
--------------------
Sorted Data: [{'fruit': 'cherry', 'price': 82}, {'fruit': 'banana', 'price': 91}, {'fruit': 'strawberry', 'price': 100}, {'fruit': 'mango', 'price': 132}]

Instead of explicitly defining a function like sort_dict_by_price, we could have used a lambda function. This time we sort by the 'fruit' key.

# List containing dictionary data
data = [
    {'fruit': 'strawberry', 'price': 100},
    {'fruit': 'banana', 'price': 91},
    {'fruit': 'mango', 'price': 132},
    {'fruit': 'cherry', 'price': 82},
]
print(f'Original Data: {data}')

# Sorting data using the lambda function
data.sort(key=lambda item: item['fruit'])
print('-' * 20)

# Printing the data
print(f'Sorted Data: {data}')

We changed the code above and passed a lambda function to the key parameter. The expression lambda item: item['fruit'] plays the same role as the previous code's sort_dict_by_price function, except that it returns the value of the 'fruit' key, so the data is sorted alphabetically by fruit name rather than by price.

Output

Original Data: [{'fruit': 'strawberry', 'price': 100}, {'fruit': 'banana', 'price': 91}, {'fruit': 'mango', 'price': 132}, {'fruit': 'cherry', 'price': 82}]
--------------------
Sorted Data: [{'fruit': 'banana', 'price': 91}, {'fruit': 'cherry', 'price': 82}, {'fruit': 'mango', 'price': 132}, {'fruit': 'strawberry', 'price': 100}]
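As an alternative to writing a function or a lambda, the standard library's operator.itemgetter can build the key function. This is a sketch of that swapped-in technique, not something the examples above use; the names by_price and by_fruit are illustrative.

```python
from operator import itemgetter

data = [
    {'fruit': 'strawberry', 'price': 100},
    {'fruit': 'banana', 'price': 91},
    {'fruit': 'mango', 'price': 132},
    {'fruit': 'cherry', 'price': 82},
]

# itemgetter('price') behaves like: lambda item: item['price']
by_price = sorted(data, key=itemgetter('price'))
by_fruit = sorted(data, key=itemgetter('fruit'))

print([d['fruit'] for d in by_price])  # ['cherry', 'banana', 'strawberry', 'mango']
print([d['fruit'] for d in by_fruit])  # ['banana', 'cherry', 'mango', 'strawberry']
```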

Example 4 - Sorting tuple data by specifying sorting criteria

In the following example, data is a list containing tuples and we sorted the data in descending order based on the first items of the tuple.

# List containing tuple data
data = [
    ('strawberry', 100),
    ('banana', 91),
    ('mango', 132),
    ('cherry', 82),
]
print(f'Original Data: {data}')

# Sorting data based on first value in descending order
data.sort(key=lambda item: item[0], reverse=True)
print('-' * 20)

# Printing the data
print(f'Sorted Data: {data}')

Output

Original Data: [('strawberry', 100), ('banana', 91), ('mango', 132), ('cherry', 82)]
--------------------
Sorted Data: [('strawberry', 100), ('mango', 132), ('cherry', 82), ('banana', 91)]

sorted()

Python's sorted() function sorts any iterable and returns the result as a new list. By default, it sorts the data in ascending order.

# Tuple data
tuple_data = ((4, 's'), (1, 'q'), (3, 'z'), (4, 'a'))
print(f'Original: {tuple_data}')

# Using sorted function
sorting = sorted(tuple_data)
print('-'*20)

# Printing the sorted data
print(f'Sorted: {sorting}')

We have nested tuple data stored in the variable tuple_data, used the sorted() function with our iterable as a parameter, and then printed the sorted data.

Output

Original: ((4, 's'), (1, 'q'), (3, 'z'), (4, 'a'))
--------------------
Sorted: [(1, 'q'), (3, 'z'), (4, 'a'), (4, 's')]

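The data was sorted based on the first item of each tuple in tuple_data. Where the first items are equal, as in (4, 'a') and (4, 's'), the second item decides the order.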

Syntax

sorted(iterable, key=None, reverse=False)

Here

iterable - Required. Any iterable data

key - Defaults to None. A function that specifies the sorting criteria.

reverse - Defaults to False. When set to True, the data will be sorted in descending order.
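The three parameters can be tried out on a small tuple. This is a minimal sketch; the words tuple and the variable names are illustrative.

```python
words = ('banana', 'Cherry', 'apple')

# Default: ascending order; uppercase letters sort before lowercase ones
default_order = sorted(words)

# key: compare the lowercased form of each word instead
case_insensitive = sorted(words, key=str.lower)

# reverse: same criteria, descending order
descending = sorted(words, key=str.lower, reverse=True)

print(default_order)     # ['Cherry', 'apple', 'banana']
print(case_insensitive)  # ['apple', 'banana', 'Cherry']
print(descending)        # ['Cherry', 'banana', 'apple']
```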

Examples

We have data of various data types that are all iterable; we'll sort them using the sorted() function.

list_data = [43, 21, 2, 34]
print(f'Sorted List: {sorted(list_data)}')

# Separator
print('-'*20)

tuple_data = (('x', 3), ('w', 1), ('1', 4))
print(f'Sorted Tuple: {sorted(tuple_data)}')

# Separator
print('-'*20)

dict_data = {9: 'G', 1: 'V', 4: 'E'}
print(f'Sorted Dictionary Keys: {sorted(dict_data)}')
print(f'Sorted Dictionary Values: {sorted(dict_data.values())}')
print(f'Sorted Dictionary Items: {sorted(dict_data.items())}')

Output

Sorted List: [2, 21, 34, 43]
--------------------
Sorted Tuple: [('1', 4), ('w', 1), ('x', 3)]
--------------------
Sorted Dictionary Keys: [1, 4, 9]
Sorted Dictionary Values: ['E', 'G', 'V']
Sorted Dictionary Items: [(1, 'V'), (4, 'E'), (9, 'G')]

Example 2 - Using key and reverse parameters

# Tuple data
tuple_data = (
    ('Mango', 25),
    ('Walnut', 65),
    ('Cherry', 10),
    ('Apple', 68),
)
print(f'Original: {tuple_data}')

# Separator
print('-'*20)

# Function for grabbing 2nd item from the data
def sorting_tup_data(item):
    return item[1]

# Sorting based on sorting criteria in descending order
sorting = sorted(tuple_data, key=sorting_tup_data, reverse=True)
print(f'Sorted: {sorting}')

Output

Original: (('Mango', 25), ('Walnut', 65), ('Cherry', 10), ('Apple', 68))
--------------------
Sorted: [('Apple', 68), ('Walnut', 65), ('Mango', 25), ('Cherry', 10)]

Because we passed the custom function to the key parameter, the tuples were sorted by their second item, and because reverse was set to True, the data was sorted in descending order.

Difference

sort()	sorted()
Used to sort the Python List.	Used to sort any iterable data such as List, Tuple, Dictionary, and more.
Takes two parameters: `key` and `reverse`.	Takes three parameters: `iterable`, `key` and `reverse`.
It is a `List` function(`list.sort()`) and can only work with Lists.	It is a function to sort any data which can be iterated.
`sort()` modifies the original list.	`sorted()` doesn't modify the original data instead it returns the new modified data.
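The differences in the table can be observed in a few lines. This is a small sketch with illustrative variable names.

```python
numbers = [3, 1, 2]
result = numbers.sort()        # sorts the list in place and returns None
print(result, numbers)         # None [1, 2, 3]

original = (3, 1, 2)
new_list = sorted(original)    # accepts a tuple and returns a new list
print(original, new_list)      # (3, 1, 2) [1, 2, 3]

# Tuples have no sort() method, so only sorted() can be used on them
tuple_has_sort = hasattr(original, 'sort')
print(tuple_has_sort)          # False
```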

Conclusion

We've seen a comparison of the list sort() and sorted() functions. We've coded the examples to understand how these functions work. Both functions are used to sort data, but the sort() function only sorts Python lists in place, whereas the sorted() function accepts any iterable and returns a new sorted list.

We've also seen the differences between the two in a table format.

🏆Other articles you might be interested in if you liked this one

Comparing the list reverse and reversed functions.

8 different ways to reverse a Python list.

NumPy argmax() and TensorFlow argmax() - Are they similar?

Execute your code dynamically using the exec() in Python.

Perform high-level file operations on files in Python.

Number your iterable data using the enumerate() in Python.

Understanding args and kwargs in function parameter in Python.

That's all for now

Keep Coding

Difference Between seek() & tell() And How To Use

Sachin Pal — Fri, 31 Mar 2023 16:11:01 GMT

Python provides methods and functions to handle files. Handling files includes operations like opening a file, after that reading the content, adding or overwriting the content and then finally closing the file.

https://geekpython.in/handling-files-in-python

We can read the content of a file with the read() function or by iterating over each line. In both cases, reading starts at the beginning of the file by default.

What if we want to read the file from a specific position or find out from where the reading begins? Python's seek() and tell() functions come in handy here.

In this article, we'll look at the differences and applications of Python's seek() and tell() functions.

Sample File

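We'll be using the sample.txt file, which contains the following text:

Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation.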

seek

The seek() function in Python is used to move the file cursor to the specified location. When we read a file, the cursor starts at the beginning, but we can move it to a specific position by passing an arbitrary integer (based on the length of the content in the file) to the seek() function.

# Opening a file for reading
with open('sample.txt', 'r') as file:
    # Setting the cursor at 62nd position
    file.seek(62)
    # Reading the content after the 62nd character
    data = file.read()
    print(data)

We've moved the cursor to the 62nd position, which means that if we read the file, we'll begin reading after the 62nd character.

Syntax

seek(offset, whence)

Here,

offset - Required parameter. Sets the cursor to the specified position and starts reading after that position.

whence - Optional parameter. It is used to set the point of reference to start from which place.

0 - Default. Sets the point of reference at the beginning of the file. Equivalent to os.SEEK_SET.

1 - Sets the point of reference at the current position of the file. Equivalent to os.SEEK_CUR.

2 - Sets the point of reference at the end of the file. Equivalent to os.SEEK_END.

Note: When a file is opened in text mode, the reference points 1 and 2 can only be used with an offset of 0; any other offset raises io.UnsupportedOperation. Binary mode has no such restriction.
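The three reference points and the text-mode restriction can be sketched as follows. The example assumes a temporary copy of the sample text (the temporary directory and the variable names are illustrative) so that it runs on its own.

```python
import io
import os
import tempfile

text = ('Python is a high-level, general-purpose programming language. '
        'Its design philosophy emphasizes code readability with the use '
        'of significant indentation.')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'sample.txt')  # temporary stand-in for sample.txt
    with open(path, 'w') as f:
        f.write(text)

    # Binary mode: every reference point accepts a non-zero offset
    with open(path, 'rb') as f:
        f.seek(62, os.SEEK_SET)      # same as f.seek(62) or f.seek(62, 0)
        from_start = f.read(3)
        f.seek(-12, os.SEEK_END)     # same as f.seek(-12, 2)
        from_end = f.read()

    # Text mode: offset 0 is accepted with SEEK_END, a non-zero offset is not
    with open(path, 'r') as f:
        f.seek(0, os.SEEK_END)
        try:
            f.seek(-12, os.SEEK_END)
            text_mode_error = None
        except io.UnsupportedOperation as exc:
            text_mode_error = str(exc)

print(from_start)   # b'Its'
print(from_end)     # b'indentation.'
print(text_mode_error)
```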

More examples

We'll see examples where we can experiment with these parameter values to better understand how they work.

Example 1 - Seek relative to the current position in text mode

# Opening a file for reading
with open('sample.txt', 'r') as file:
    # Setting the cursor at 62nd position and sets the reference point equal to 1
    file.seek(62, 1)
    # Reading the content after the 62nd character
    data = file.read()
    print(data)

We opened the file in text mode, moved the cursor to the 62nd position, and set the reference point as the file's current position. Then we attempted to read the file.

Traceback (most recent call last):
  ....
    file.seek(62, 1)
io.UnsupportedOperation: can't do nonzero cur-relative seeks

Python returned an error message stating that this operation is not supported because seek relative to current position cannot be performed with a number other than 0. If we specify a reference point equal to 2, the result will be the same.

However, if we had opened the file in binary mode, the above code would have been executed without error.

Example 2 - Seek relative to the current position in binary mode

# Opening a file for reading in binary mode
with open('sample.txt', 'rb') as file:
    # Setting the cursor at 62nd position and
    # setting the reference point equal to 1
    file.seek(62, 1)
    # Reading the content after the 62nd character
    data = file.read()
    print(data)

Output

b'Its design philosophy emphasizes code readability with the use of significant indentation.'

Example 3 - Specifying the negative offset value.

# Opening a file for reading in binary mode
with open('sample.txt', 'rb') as file:
    # Setting the cursor at 25th position from the end
    file.seek(-25, 2)
    # Reading the content
    data = file.read()
    print(data)
----------
b' significant indentation.'

The reference point is the end of the file, and the offset of -25 moves the cursor 25 characters back from it. Reading from there returns the last 25 characters of the file.

Example 4

# Opening a file for reading in binary mode
with open('sample.txt', 'rb') as file:
    # Setting the cursor at 50th position
    file.seek(50)
    # Moving back 10 chars from the current position
    file.seek(-10, 1)
    # Reading the content
    data = file.read()
    print(data)

Output

b'programming language. Its design philosophy emphasizes code readability with the use of significant indentation.'

tell

The seek() function sets the position of the file cursor, whereas the tell() function returns the cursor's current position, that is, the point where the next read will begin.

# Opening a file
with open('sample.txt', 'r') as file:
    # Using tell() function
    pos = file.tell()
    # Printing the position of the cursor
    print(f'File cursor position: {pos}')
    # Printing the content
    data = file.read()
    print(data)

Output

File cursor position: 0
Python is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation.

Syntax

tell()

The tell() function takes no parameter.

Examples

As previously stated, we can use the tell() function to return the cursor position set by the seek() function. To determine the position of the cursor, we'll experiment with the seek() function parameter values.

Example 1 - Setting cursor position from the end and printing the cursor position

# Opening a file in binary mode
with open('sample.txt', 'rb') as file:
    # Setting cursor at the 25th pos from the end
    file.seek(-25, 2)
    # Using tell() function
    pos = file.tell()
    # Printing the position of the cursor
    print(f'File cursor position: {pos}')
    # Printing the content
    data = file.read()
    print(data)

Output

File cursor position: 127
b' significant indentation.'

The file is 152 characters long, so moving 25 characters back from the end places the cursor at position 152 - 25 = 127. That's why we got the cursor position as 127.

Example 2

# Opening a file in binary mode
with open('sample.txt', 'rb') as file:
    # Setting cursor at the 100th pos
    file.seek(100)
    print(f'File cursor position before: {file.tell()}')
    # Moving back 15 chars from the current position
    file.seek(-15, 1)
    # Using tell() function
    pos = file.tell()
    # Printing the position of the cursor
    print(f'File cursor position now: {pos}')
    # Printing the content
    data = file.read()
    print(data)

Output

File cursor position before: 100
File cursor position now: 85
b'mphasizes code readability with the use of significant indentation.'

Difference

seek()	tell()
Used to set the file cursor to the specific position.	Used to tell the position of the file cursor.
Takes two parameters: the first is `offset` and the second is `whence`.	Takes no parameter
By using `seek()` function, we can manipulate the reading position of the file's content.	By using the `tell()` function, we can only get the position of the file cursor.
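A common way to combine the two functions is to bookmark a position with tell() and return to it later with seek(). The sketch below assumes a small temporary file; the file content and the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'lines.txt')  # hypothetical temporary file
    with open(path, 'wb') as f:
        f.write(b'first line\nsecond line\n')

    with open(path, 'rb') as f:
        f.readline()                   # consume the first line
        bookmark = f.tell()            # remember where the second line starts
        second = f.readline()
        f.seek(bookmark)               # jump back to the bookmark
        second_again = f.readline()    # the same line is read again
        size = f.seek(0, os.SEEK_END)  # seek() returns the new position

print(bookmark)       # 11
print(second_again)   # b'second line\n'
print(size)           # 23
```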

Conclusion

The article discussed the differences between the seek() and tell() functions, as well as how to use them effectively to handle file data. The two functions complement each other: seek() moves the file cursor, while tell() only reports where it currently is.

🏆Other articles you might be interested in if you liked this one

Generate temporary files and directories using tempfile in Python.

Creating and implementing abstract classes in Python.

How to use getitem, setitem and delitem in Python.

How to use super() function in Python classes.

How to integrate the PostgreSQL database with Python?

Build your command line interface in a few steps.

Perform high-level file operation using the shutil in Python.

That's all for now

Keep Coding

Python's ABC: Understanding the Basics of Abstract Base Classes

Sachin Pal — Mon, 27 Mar 2023 15:45:57 GMT

What is the ABC of Python? It stands for the abstract base class and is a concept in Python classes based on abstraction. Abstraction is an integral part of object-oriented programming.

Abstraction is what we call hiding the internal process of the program from the users. Take the example of the computer mouse where we click the left or right button and something respective of it happens or scroll the mouse wheel and a specific task happens. We are unaware of the internal functionality but we do know that clicking this button will do our job.

In abstraction, users are unaware of the internal functionality but are familiar with the purpose of the method. If we take an example of a datetime module, we do know that running the datetime.now() function will return the current date and time but are unaware of how this happens.

ABC in Python

Python supports object-oriented features such as abstract classes and abstraction. There is no built-in keyword for declaring an abstract class, so Python provides a module called abc that supplies the infrastructure for defining Abstract Base Classes (ABCs).

What are abstract base classes? They provide a blueprint for concrete classes. They declare methods without necessarily implementing them; instead, subclasses are required to provide the implementation.

Defining ABC

Let's understand with an example how we can define an abstract class with an abstract method inside it.

from abc import ABC, abstractmethod

class Friend(ABC):
    @abstractmethod
    def role(self):
        pass

The class Friend derives from the ABC class of the abc module, which makes it an abstract class. Within the class, the @abstractmethod decorator is applied to indicate that the function role is an abstract method.

The abc module has another class, ABCMeta, which can also be used to create abstract classes.

from abc import ABCMeta, abstractmethod

class Friend(metaclass=ABCMeta):
    @abstractmethod
    def role(self):
        pass
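Both definitions end up with the same metaclass, which can be checked with type(). This is a small sketch; the class names ViaABC and ViaMeta are illustrative.

```python
from abc import ABC, ABCMeta, abstractmethod

class ViaABC(ABC):
    @abstractmethod
    def role(self):
        pass

class ViaMeta(metaclass=ABCMeta):
    @abstractmethod
    def role(self):
        pass

# Inheriting from ABC sets the metaclass to ABCMeta behind the scenes
same_metaclass = type(ViaABC) is ABCMeta and type(ViaMeta) is ABCMeta
print(same_metaclass)  # True

# Both classes record which methods are still abstract
print(ViaABC.__abstractmethods__ == ViaMeta.__abstractmethods__)  # True
```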

ABC in action

Now we'll see how to make an abstract class and abstract method and then implement it inside the concrete classes through inheritance.

from abc import ABC, abstractmethod

# Defining abstract class
class Friends(ABC):
    # abstract method decorator to indicate that the
    # method right below is an abstract method.
    @abstractmethod
    def role(self):
        pass

# Concrete derived class inheriting from abstract class
class Sachin(Friends):
    # Implementing abstract method
    def role(self):
        print('Python Developer')

# Concrete derived class inheriting from abstract class
class Rishu(Friends):
    # Implementing abstract method
    def role(self):
        print('C++ Developer')

# Concrete derived class inheriting from abstract class
class Yashwant(Friends):
    # Implementing abstract method
    def role(self):
        print('C++ Developer')

# Instantiating concrete derived class
roles = Sachin()
roles.role()

# Instantiating concrete derived class
roles = Rishu()
roles.role()

# Instantiating concrete derived class
roles = Yashwant()
roles.role()

In the above code, we created an abstract class called Friends and defined the abstract method role within the class using the @abstractmethod decorator.

Then we created three concrete derived classes, Sachin, Rishu, and Yashwant, that inherit from the class Friends, and implemented the abstract method role within them.

Then we instantiated the derived classes and called the role method on each instance to display the result.

Python Developer
C++ Developer
C++ Developer

As we discussed earlier, abstract classes provide a blueprint for implementing methods into concrete subclasses.

from abc import ABC, abstractmethod

class Details(ABC):
    @abstractmethod
    def getname(self):
        return self.name

    @abstractmethod
    def getrole(self):
        return self.role

    @abstractmethod
    def gethobby(self):
        return self.hobby

class Sachin(Details):
    def __init__(self, name="Sachin"):
        self.name = name
        self.role = "Python Dev"
        self.hobby = "Fuseball spielen"

    def getname(self):
        return self.name

    def getrole(self):
        return self.role

    def gethobby(self):
        return self.hobby

detail = Sachin()
print(detail.getname())
print(detail.getrole())
print(detail.gethobby())

In the above code, we created a blueprint for the class Details. Then we created a class called Sachin that inherits from Details, and we implemented the methods according to the blueprint.

Then we instantiated the class Sachin and then printed the values of getname, getrole and gethobby.

Sachin
Python Dev
Fuseball spielen

What if we create a class that doesn't follow the abstract class blueprint?

from abc import ABC, abstractmethod

class Details(ABC):
    @abstractmethod
    def getname(self):
        return self.name

    @abstractmethod
    def getrole(self):
        return self.role

    @abstractmethod
    def gethobby(self):
        return self.hobby

class Sachin(Details):
    def __init__(self, name="Sachin"):
        self.name = name
        self.role = "Python Dev"
        self.hobby = "Fuseball spielen"

    def getname(self):
        return self.name

    def getrole(self):
        return self.role

detail = Sachin()
print(detail.getname())
print(detail.getrole())

Python will raise an error upon executing the above code because the class Sachin doesn't follow the class Details blueprint.

Traceback (most recent call last):
  ....
TypeError: Can't instantiate abstract class Sachin with abstract method gethobby

Use of ABC

As we saw in the above example, if a derived class doesn't follow the blueprint of the abstract class, an error is raised when we try to instantiate it.

That's where an ABC (Abstract Base Class) plays an important role: it makes sure that subclasses follow the blueprint. Thus, subclasses that inherit from the abstract class must follow the same structure and implement the abstract methods.
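The check happens at instantiation time, so it can be caught like any other exception. The following is a minimal sketch; the class names Incomplete and Complete and the returned strings are illustrative.

```python
from abc import ABC, abstractmethod

class Details(ABC):
    @abstractmethod
    def getname(self):
        pass

    @abstractmethod
    def gethobby(self):
        pass

# Implements only one of the two abstract methods
class Incomplete(Details):
    def getname(self):
        return 'Sachin'

# Fills in the remaining abstract method
class Complete(Incomplete):
    def gethobby(self):
        return 'Football'

try:
    Incomplete()
    error = None
except TypeError as exc:
    error = str(exc)   # the message names the missing method

print(error)
person = Complete()     # works: every abstract method is implemented
print(person.getname(), person.gethobby())  # Sachin Football
```

Note that defining the Incomplete class is allowed; only creating an instance of it fails.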

Concrete methods inside ABC

We can also define concrete methods within the abstract classes. The concrete method is the normal method that has a complete definition.

from abc import ABC, abstractmethod

# Abstract class
class Demo(ABC):
    # Defining a concrete method
    def concrete_method(self):
        print("Calling concrete method")

    @abstractmethod
    def data(self):
        pass

# Derived class
class Test(Demo):
    def data(self):
        pass

# Instantiating the class
data = Test()

# Calling the concrete method
data.concrete_method()

In the above code, concrete_method is a concrete method defined within the abstract class and the method was invoked using the instance of the class Test.

Calling concrete method

Abstract class instantiation

An abstract class can be defined but cannot be instantiated because it is not a concrete class. Python doesn't allow creating objects of abstract classes because their abstract methods have no concrete implementation to invoke; they require subclasses to provide it.

from abc import ABC, abstractmethod

class Details(ABC):
    @abstractmethod
    def getname(self):
        return self.name

    @abstractmethod
    def getrole(self):
        return self.role

class Sachin(Details):
    def __init__(self, name="Sachin"):
        self.name = name
        self.role = "Python Dev"

    def getname(self):
        return self.name

    def getrole(self):
        return self.role

# Instantiating the abstract class
abstract_class = Details()

Output

Traceback (most recent call last):
  ....
TypeError: Can't instantiate abstract class Details with abstract methods getname, getrole

We got the error stating that we cannot instantiate the abstract class Details with abstract methods called getname and getrole.

Abstract property

Just as the abc module allows us to define abstract methods using the @abstractmethod decorator, it also allows us to define abstract properties using the @abstractproperty decorator.

from abc import ABC, abstractproperty

# Abstract class
class Hero(ABC):
    @abstractproperty
    def hero_name(self):
        return self.hname

    @abstractproperty
    def reel_name(self):
        return self.rname

# Derived class
class RDJ(Hero):
    def __init__(self):
        self.hname = "IronMan"
        self.rname = "Tony Stark"

    @property
    def hero_name(self):
        return self.hname

    @property
    def reel_name(self):
        return self.rname

data = RDJ()
print(f'The hero name is: {data.hero_name}')
print(f'The reel name is: {data.reel_name}')

We created an abstract class Hero and defined two abstract properties called hero_name and reel_name using @abstractproperty. Then, within the derived class RDJ, we implemented them using the @property decorator, which turns them into getter methods.

Then we instantiated the class RDJ and used the class instance to access the values of hero_name and reel_name.

The hero name is: IronMan
The reel name is: Tony Stark

If we had not placed the @property decorator inside the class RDJ, we would have had to call hero_name and reel_name as regular methods.

from abc import ABC, abstractproperty

# Abstract class
class Hero(ABC):
    @abstractproperty
    def hero_name(self):
        return self.hname

    @abstractproperty
    def reel_name(self):
        return self.rname

# Derived class
class RDJ(Hero):
    def __init__(self):
        self.hname = "IronMan"
        self.rname = "Tony Stark"

    def hero_name(self):
        return self.hname

    def reel_name(self):
        return self.rname

data = RDJ()
print(f'The hero name is: {data.hero_name()}')
print(f'The reel name is: {data.reel_name()}')
----------
The hero name is: IronMan
The reel name is: Tony Stark

Note: abstractproperty is deprecated (since Python 3.3); instead, we can stack @property on top of @abstractmethod to define an abstract property. The PyCharm IDE gives a warning when abstractproperty is used.

The abstract class will look like the following if we modify it.

from abc import ABC, abstractmethod

# Abstract class
class Hero(ABC):
    @property
    @abstractmethod
    def hero_name(self):
        return self.hname

    @property
    @abstractmethod
    def reel_name(self):
        return self.rname

Apply these modifications within the abstract class and run the code again. It will run without any error, as before.
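For reference, here is a compact, self-contained version that uses @property together with @abstractmethod. It is a sketch with a single property and an illustrative attribute name (_hname), not the full example from above.

```python
from abc import ABC, abstractmethod

class Hero(ABC):
    @property
    @abstractmethod
    def hero_name(self):
        """Subclasses must provide this property."""

class RDJ(Hero):
    def __init__(self):
        self._hname = 'IronMan'

    @property
    def hero_name(self):
        return self._hname

data = RDJ()
name = data.hero_name          # accessed like an attribute, no parentheses
print(f'The hero name is: {name}')

# The abstract class itself still cannot be instantiated
try:
    Hero()
    hero_error = None
except TypeError as exc:
    hero_error = str(exc)
print(hero_error)
```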

Conclusion

We've covered the fundamentals of abstract classes, abstract methods, and abstract properties in this article. Python has an abc module that provides infrastructure for defining abstract base classes.

The ABC class from the abc module can be used to create an abstract class. ABC is a helper class that has ABCMeta as its metaclass, and we can also define abstract classes by passing the metaclass keyword and using ABCMeta. Both approaches produce the same kind of abstract class; inheriting from ABC simply avoids the metaclass syntax.

After creating the abstract class, we used the @abstractmethod and (@property with @abstractmethod) decorators to define the abstract methods and abstract properties within the class.

To understand the theory, we've coded the examples alongside the explanation.

🏆Other articles you might be interested in if you liked this one

What are class inheritance and different types of inheritance in Python?

Learn the use cases of the super() function in Python classes.

How underscores modify accessing the attributes and methods in Python.

What are __init__ and __call__ in Python.

Implement a custom deep learning model into the Flask app for image recognition.

Train a custom deep learning model using the transfer learning technique.

How to augment data using the existing data in Python.

That's all for now

Keep Coding

Handling Files In Python - Opening, Reading & Writing

Sachin Pal — Tue, 21 Mar 2023 12:47:20 GMT

Files are used to store information, and when we need to access the information, we open the file and read or modify it. We can use the GUI to perform these operations in our systems.

Many programming languages include methods and functions for managing, reading, and even modifying file data. Python is one of the programming languages that can handle files.

In this article, we'll look at how to handle files, which includes the methods and operations for reading and writing files, as well as other methods for working with files in Python. We'll also make a project to adopt a pet and save the entry in the file.

Primary operation - Opening a file

We must first open a file before we can read or write to it. Use Python's built-in open() function to perform this operation on the file.

Here's an example of how to use Python's open() function.

file_obj = open('file.txt', mode='r')

We specified two parameters. The first is the file name, and the second is the mode, which determines the mode in which we want to open the file.

There are several modes to open files, and the most commonly used ones are listed below.

r - Read. The file opens in read mode. Raises an error if the file doesn't exist.

w - Write. The file opens in write mode. Creates the file if it doesn't exist and overwrites its content if it does.

a - Append. The file opens in append mode. Writes data at the end of the file and creates the file if it doesn't exist.

x - Create. Creates a file. Raises an error if the specified file already exists.

To specify in which mode a file should be handled, we can use

t - Text. The file opens in text mode which means the file will store text data. It is a default mode.

b - Binary. The file opens in the binary mode which means the file will store binary data.

Hence, the primary task is to open a file on which you must operate and specify the mode specific to the work you wish to perform on the file.
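The behaviour of the four modes can be sketched in one short script. It works inside a temporary directory so that no real files are touched; the file name and the variable names are illustrative.

```python
import os
import tempfile

read_missing = create_existing = None

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'file.txt')

    # 'r' fails when the file doesn't exist
    try:
        open(path, 'r')
    except FileNotFoundError:
        read_missing = 'FileNotFoundError'

    # 'x' creates the file, but fails if it already exists
    with open(path, 'x') as f:
        f.write('one')
    try:
        open(path, 'x')
    except FileExistsError:
        create_existing = 'FileExistsError'

    # 'a' adds to the end of the existing content
    with open(path, 'a') as f:
        f.write(' two')
    with open(path) as f:          # no mode given: read and text mode
        after_append = f.read()

    # 'w' replaces the existing content
    with open(path, 'w') as f:
        f.write('three')
    with open(path) as f:
        after_write = f.read()

print(read_missing, create_existing)  # FileNotFoundError FileExistsError
print(after_append)                   # one two
print(after_write)                    # three
```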

Reading a file

We can open the file in reading mode by passing the file name to the open() function. Read mode is the default; if the mode parameter is not specified, the file opens in read and text mode.

To perform this operation, we have a file called text_file.txt which has some content inside it.

# Using open() function
file = open('text_file.txt', 'r')

# Printing the content inside the file
for content in file:
    print(content)
----------
Hey, there Geeks. Welcome to GeekPython.

We successfully read the content inside the file by iterating them using the Python for loop.

Using read()

We can use file.read() to read the characters from the file. Here's a code that will read the content from the file.

# Using open() function
file = open('text_file.txt', 'r')

# Reading file using read()
print(file.read())
----------
Hey, there Geeks. Welcome to GeekPython.

Closing the file

When we're done working with files, we need to close them so that the resources associated with them can be released. After working with a file, it is best practice to close it.

We can use file.close() to close the current working file.

# Using open() function
file = open('text_file.txt', 'r')

# Reading file using read()
print(file.read())

# Closing the file
file.close()

# Checking if the file is closed or not
print(file.closed)
----------
Hey, there Geeks. Welcome to GeekPython.
True

The code returned True which means the file was closed successfully.

Using with keyword

The file object returned by Python's open() is a context manager, so we can use the with keyword to open and read the file's content.

# Using with keyword
with open('text_file.txt', 'r') as file:
    # Printing the content
    print(file.read())

Here, open() opens the file and returns the file object, and the as keyword binds that object to the name file. We can now use file to print the content. When the with block ends, the file is closed automatically.
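The automatic closing can be verified with file.closed, including the case where an error occurs inside the block. This is a sketch that uses a temporary file; the simulated ValueError and the variable names are illustrative.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'text_file.txt')  # temporary stand-in file
    with open(path, 'w') as f:
        f.write('Hey, there Geeks. Welcome to GeekPython.')

    with open(path, 'r') as file:
        content = file.read()
    closed_after_with = file.closed       # True: closed when the block ended

    try:
        with open(path, 'r') as file:
            raise ValueError('simulated failure')
    except ValueError:
        closed_after_error = file.closed  # True: closed despite the error

print(content)
print(closed_after_with, closed_after_error)  # True True
```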

Writing into file

As we saw in the earlier section, if we want to write data into a file, we need to use the 'w' mode. But before writing data into the file, remember that:

If the specified file does not exist, a new one will be created.
If the specified file already exists, the data within it will be replaced with new data.

In the following code, we'll create a new file and then write some content inside it.

# Creating and writing in a file
with open('test.txt', 'w') as file:
    # Writing data inside the file
    file.write('You guys are awesome, Thanks for reading.')
    print('Data written successfully.')
    # No close() call is needed: the with block closes the file
----------
Data written successfully.

A file named test.txt will be created and the content will be written inside it.

What do you think will happen if we try to add more content inside the test.txt file and run the above code?

# Writing in a file
with open('test.txt', 'w') as file:
    # Writing data inside the file
    file.write('If you love this, Bookmark it and share this.')
    print('Data written successfully.')
    # No close() call is needed: the with block closes the file
----------
Data written successfully.

The previous content will be erased and the new content will be written.

Appending data to the file

We can append data to files by opening them in 'a' mode. This will not overwrite our existing content but rather add new data to the file.

# Adding new data to the file
with open('test.txt', 'a') as file:
    # Writing new data
    file.write('\nPlease check other articles on GeekPython.')
    print('Data written successfully.')
    # No close() call is needed: the with block closes the file
----------
Data written successfully.

The test.txt file will be opened in append mode, and the content will be appended to it without overwriting any existing data.

Using readline()

Now that we have some content stored in two different lines in our file, it's a good time to demonstrate the readline() function.

with open('test.txt', 'r') as file:
    # Reading first line
    print(file.readline())
    # Reading second line
    print(file.readline())
----------
If you love this, Bookmark it and share this.

Please check other articles on GeekPython.

The readline() function reads one line from the file each time it is called. Our test.txt file contains two lines, so we called readline() twice. The related readlines() function reads all the remaining lines at once and returns them as a list.
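To make the difference between readline(), readlines(), and plain iteration concrete, here is a small sketch. It uses a temporary file with illustrative content rather than test.txt.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'lines.txt')  # hypothetical temporary file
    with open(path, 'w') as f:
        f.write('first\nsecond\nthird\n')

    with open(path, 'r') as f:
        one = f.readline()     # a single line, including its newline
        rest = f.readlines()   # every remaining line, as a list

    # Iterating over the file object also yields one line at a time
    with open(path, 'r') as f:
        stripped = [line.rstrip('\n') for line in f]

print(repr(one))   # 'first\n'
print(rest)        # ['second\n', 'third\n']
print(stripped)    # ['first', 'second', 'third']
```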

Reading & writing binary data

Binary('b') mode is available for reading and writing binary data. We must specify 'rb' to read the binary content from the file, and similarly, 'wb' to write binary data into the file.

# Opening an image file in read binary mode
with open('writing.png', 'rb') as f:
    # Reading the bytes of the image
    print(f.read())

The output is the raw content of the image, printed as a bytes object.

The following code will show how we can write binary data into the file.

# Opening image in write binary mode
with open('writing.png', 'wb') as binary_file:
    # Overwriting the bytes of the image
    binary_file.write(b'Hello there')

We overwrote the image file's previous content with new content. Now, if we read the image's bytes, we'll get b'Hello there'.

The image file is now corrupted and we won't be able to open this image.
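A safer way to experiment with binary modes is to copy the bytes into a separate file instead of overwriting the original. The following is a minimal sketch under that assumption; the source.bin and copy.bin names and the sample bytes are hypothetical stand-ins for a real image.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    source = os.path.join(tmp_dir, 'source.bin')
    copy = os.path.join(tmp_dir, 'copy.bin')

    # Stand-in for real binary content such as image data
    original_bytes = bytes([0, 255, 16, 32])
    with open(source, 'wb') as f:
        f.write(original_bytes)

    # Read from one file in 'rb' mode and write to another in 'wb' mode
    with open(source, 'rb') as src, open(copy, 'wb') as dst:
        dst.write(src.read())

    with open(copy, 'rb') as f:
        copied_bytes = f.read()

print(copied_bytes == original_bytes)
```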

Exception handling - try... finally

When we encounter errors or exceptions while performing a specific operation on a file, the program exits without closing the file. As a result, we must take precautions to address this issue.

We can use a try-finally block to handle the exception. The following code shows how to use it.

try:
    file = open('test.txt', 'w')
    # Writing content
    write_data = file.write("Hey, how it's going.")
    # Trying to read the content
    read_data = file.read()
    print(read_data)
finally:
    # Closing the file
    file.close()
    # Printing if file is closed
    print(file.closed)

The above code will raise an error stating that the read operation is not supported, but before the program terminates, Python runs the code inside the finally block, so the file is still closed.

Traceback (most recent call last):
  ....
    read_data = file.read()
io.UnsupportedOperation: not readable
True
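The with statement gives the same guarantee as try-finally with less code, and an except clause can be added if the program should carry on after the error. Here is a minimal sketch of that combination. The temporary directory is an assumption added so the example does not touch a real test.txt.

```python
import io
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'test.txt')
    try:
        with open(path, 'w') as file:
            file.write("Hey, how it's going.")
            # Raises an error: the file was opened for writing only
            file.read()
    except io.UnsupportedOperation as error:
        # The with block has already closed the file at this point
        message = str(error)

print(message)
print(file.closed)
```

The file is closed as soon as the with block is left, whether it ends normally or through an exception.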

Bonus - Pet Adoption project

https://gist.github.com/Sachin-crypto/e0eef715640b44f8853f3af628e718ee

Here's a simple Python program that prompts the user to select which operations they want to perform after they enter their name. When a user performs an adoption operation, the name is saved in the file along with the current timestamp. The user can view the information stored in the file by performing the information operation.

Go ahead and run it in your code editor and also try to make a better version of it.

Conclusion

With code examples, we learned the fundamental operations of creating, reading, writing, and closing files in this article. To perform these operations, the file must first be opened, and then the mode specific to the operation we want to perform on the file must be specified.

If no mode is specified, read and text mode is used to open a file, and we can also write and read binary data from the file.

We've also written a Python program in which a user can adopt a pet for themselves, and the data is saved in a file.

🏆Other articles you might be interested in if you liked this one

How to open and read multiple files simultaneously in Python.

Reading and writing zip files without extracting them in Python.

Perform high-level file operations on the file using the shutil module in Python.

Take multiple inputs from the user in a single line in Python.

enumerate() function in Python.

How to use async-await in Python.

Perform a parallel iteration over multiple iterables using zip() function in Python.

That's all for now

Keep Coding

How to implement getitem, setitem, and delitem in Python

Sachin Pal — Thu, 16 Mar 2023 16:03:33 GMT

Python has numerous dunder methods (methods whose names start and end with double underscores) to perform various tasks. The most commonly used dunder method is __init__, which is used in Python classes to initialize newly created objects.

In this article, we'll see the usage and implementation of the underutilized dunder methods such as __getitem__, __setitem__, and __delitem__ in Python.

__getitem__

As the name __getitem__ suggests, this method is used to access items from a list, dictionary, or array.

If we have a list of names and want to access the item on the third index, we would use name_list[3], which will return the name from the list on the third index. When the name_list[3] is evaluated, Python internally calls __getitem__ on the data (name_list.__getitem__(3)).

The following example shows us the practical demonstration of the above theory.

# List of names
my_list = ['Sachin', 'Rishu', 'Yashwant', 'Abhishek']
# Accessing items using bracket notation
print('Accessed items using the bracket notation')
print(my_list[0])
print(my_list[2], "\n")
# Accessing items using __getitem__
print('Accessed items using the __getitem__')
print(my_list.__getitem__(1))
print(my_list.__getitem__(3))

We used the commonly used bracket notation to access the items from the my_list at the 0th and 2nd index and then to access the items at the 1st and 3rd index, we implemented the __getitem__ method.

Accessed items using the bracket notation
Sachin
Yashwant 

Accessed items using the __getitem__
Rishu
Abhishek

Syntax

__getitem__(self, key)

The __getitem__ is used to evaluate the value of self[key] by the object or instance of the class. Just like we saw earlier, object[key] is equivalent to object.__getitem__(key).

self - object or instance of the class

key - the index or key of the item we want to access

__getitem__ in Python classes

# Creating a class
class Products:
    def __getitem__(self, items):
        print(f'Item: {items}')
item = Products()
item['RAM', 'ROM']
item[{'Storage': 'SSD'}]
item['Graphic Card']

We created a Python class named Products and then defined the __getitem__ method to print the items. Then we created an instance of the class called item and then passed the values.

Item: ('RAM', 'ROM')
Item: {'Storage': 'SSD'}
Item: Graphic Card

These values are of various data types and were actually parsed, for example, item['RAM', 'ROM'] was parsed as a tuple and this expression was evaluated by the interpreter as item.__getitem__(('RAM', 'ROM')).

Checking the type of the item along with the items.

import math
# Creating a class
class Products:
    # Printing the types of item along with items
    def __getitem__(self, items):
        print(f'Item: {items}. Type: {type(items)}')
item = Products()
item['RAM', 'ROM']
item[{'Storage': 'SSD'}]
item['Graphic Card']
item[math]
item[89]

Output

Item: ('RAM', 'ROM'). Type: <class 'tuple'>
Item: {'Storage': 'SSD'}. Type: <class 'dict'>
Item: Graphic Card. Type: <class 'str'>
Item: <module 'math' (built-in)>. Type: <class 'module'>
Item: 89. Type: <class 'int'>

Example

In the following example, we created a class called Products, an __init__ that takes items and a price, and a __getitem__ that prints the value and type of the value passed inside the indexer.

Then we instantiated the class Products and passed the arguments 'Pen' and 10 to it, which we saved inside the obj. Then, using the instance obj, we attempted to obtain the values by accessing the parameters items and price.

# Creating a class
class Products:
    # Creating a __init__ function
    def __init__(self, items, price):
        self.items = items
        self.price = price
    def __getitem__(self, value):
        print(value, type(value))
# Creating instance of the class and passing the values
obj = Products('Pen', 10)
# Accessing the values
obj[obj.items]
obj[obj.price]

Output

Pen <class 'str'>
10 <class 'int'>
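The examples above only print the key. In practice, __getitem__ usually returns a value. As a minimal sketch (the Playlist class and its song names are hypothetical, introduced only for illustration), a class can delegate to an internal list and treat slices separately:

```python
class Playlist:
    """Hypothetical container used only to illustrate __getitem__."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __getitem__(self, key):
        # A slice returns a new Playlist; an integer returns a single song
        if isinstance(key, slice):
            return Playlist(self._songs[key])
        return self._songs[key]

    def __len__(self):
        return len(self._songs)


playlist = Playlist(['Intro', 'Verse', 'Chorus', 'Outro'])
first = playlist[0]      # playlist.__getitem__(0)
last = playlist[-1]      # playlist.__getitem__(-1)
middle = playlist[1:3]   # playlist.__getitem__(slice(1, 3))

print(first, last, len(middle))
```

Because __getitem__ accepts integer indexes and raises IndexError past the end, Python can also iterate over a Playlist with a for loop even though the class defines no __iter__ method.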

__setitem__

The __setitem__ is used to assign the values to the item. When we assign or set a value to an item in a list, array, or dictionary, this method is called internally.

Here's an example in which we created a list of names, modified the list by changing the name at index 1 (my_list[1] = 'Yogesh'), and then printed the updated list.

To demonstrate what the interpreter does internally, we modified the list with the help of __setitem__.

# List of names
my_list = ['Sachin', 'Rishu', 'Yashwant', 'Abhishek']
# Assigning other name at the index value 1
my_list[1] = 'Yogesh'
print(my_list)
print('-'*20)
# What interpreter does internally
my_list.__setitem__(2, 'Rishu')
print(my_list)

When we run the above code, we'll get the following output.

['Sachin', 'Yogesh', 'Yashwant', 'Abhishek']
--------------------
['Sachin', 'Yogesh', 'Rishu', 'Abhishek']

Syntax

__setitem__(self, key, value)

The __setitem__ assigns a value to the key. If we call self[key] = value, then it will be evaluated as self.__setitem__(key, value).

self - object or instance of the class

key - the index or key whose value will be replaced

value - the new value assigned to that key

__setitem__ in Python classes

The following example demonstrates the implementation of the __setitem__ method in a Python class.

# Creating a class
class Roles:
    # Defining __init__ method
    def __init__(self, role, name):
        # Creating a dictionary with key-value pair
        self.detail = {
            'name': name,
            'role': role
        }
    # Defining __getitem__ method
    def __getitem__(self, key):
        return self.detail[key]
    # Function to get the role and name
    def getrole(self):
        return self.__getitem__('role'), self.__getitem__('name')
    # Defining __setitem__ method
    def __setitem__(self, key, value):
        self.detail[key] = value
    # Function to set the role and name
    def setrole(self, role, name):
        print(f'{role} role has been assigned to {name}.')
        return self.__setitem__('role', role), self.__setitem__('name', name)
# Instantiating the class with required args
data = Roles('Python dev', 'Sachin')
# Printing the role with name
print(data.getrole())
# Setting the role for other guys
data.setrole('C++ dev', 'Rishu')
# Printing the assigned role with name
print(data.getrole())
# Setting the role for other guys
data.setrole('PHP dev', 'Yashwant')
# Printing the assigned role with name
print(data.getrole())

We created a Roles class and a __init__ function, passing the role and name parameters and storing them in a dictionary.

Then we defined the __getitem__ method, which returns the key's value, and the getrole() function, which accesses the value passed to the key name and role.

Similarly, we defined the __setitem__ method, which assigns a value to the key, and we created the setrole() function, which assigns the specified values to the key role and name.

The class was then instantiated with the required arguments, Roles('Python dev', 'Sachin'), and the instance was stored in the data variable. We printed the result of getrole() to get the role and name, then called setrole() twice with different roles and names, printing the result of getrole() after each setrole() call.

('Python dev', 'Sachin')
C++ dev role has been assigned to Rishu.
('C++ dev', 'Rishu')
PHP dev role has been assigned to Yashwant.
('PHP dev', 'Yashwant')

We got the values passed as an argument to the class but after it, we set the different roles and names and got the output we expected.
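One common reason to define __setitem__ is to validate values before storing them. The following is a minimal sketch of that idea, not something taken from the example above: the Inventory class, its product names, and the validation rule are all hypothetical.

```python
class Inventory:
    """Hypothetical class showing validation inside __setitem__."""

    def __init__(self):
        self._stock = {}

    def __getitem__(self, product):
        return self._stock[product]

    def __setitem__(self, product, quantity):
        # Reject anything that is not a non-negative integer
        if not isinstance(quantity, int) or quantity < 0:
            raise ValueError('quantity must be a non-negative integer')
        self._stock[product] = quantity


inventory = Inventory()
inventory['pen'] = 10   # inventory.__setitem__('pen', 10)
inventory['pen'] = 7    # replaces the previous value

try:
    inventory['pencil'] = -1
    rejected = False
except ValueError:
    rejected = True

print(inventory['pen'], rejected)
```

With this design, every assignment through bracket notation passes through the same check, so invalid data never reaches the internal dictionary.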

__delitem__

The __delitem__ method deletes the items in the list, dictionary, or array. The item can also be deleted using the del keyword.

# List of names
my_list = ['Sachin', 'Rishu', 'Yashwant', 'Abhishek']
# Deleting the first item of the list
del my_list[0]
print(my_list)
# Deleting the item using __delitem__
my_list.__delitem__(1)
print(my_list)
----------
['Rishu', 'Yashwant', 'Abhishek']
['Rishu', 'Abhishek']

In the above code, we specified the del keyword and then specified the index number of the item to be deleted from my_list.

So, when we call del my_list[0] which is equivalent to del self[key], Python will call my_list.__delitem__(0) which is equivalent to self.__delitem__(key).

__delitem__ in Python classes

class Friends:
    def __init__(self, name1, name2, name3, name4):
        self.n = {
            'name1': name1,
            'name2': name2,
            'name3': name3,
            'name4': name4
        }
    # Function for deleting the entry
    def delname(self, key):
        self.n.__delitem__(key)
    # Function for adding/modifying the entry
    def setname(self, key, value):
        self.n[key] = value
friend = Friends('Sachin', 'Rishu', 'Yashwant', 'Abhishek')
print(friend.n, "\n")
# Deleting an entry
friend.delname('name3')
print('After deleting the name3 entry')
print(friend.n, "\n")
# Modifying an entry
friend.setname('name2', 'Yogesh')
print('name2 entry modified')
print(friend.n, "\n")
# Deleting an entry
friend.delname('name2')
print('After deleting the name2 entry')
print(friend.n)

We defined the delname function in the preceding code, which takes a key and deletes that entry from the dictionary created inside the __init__ function, as well as the setname function, which modifies/adds the entry to the dictionary.

Then we instantiated the Friends class, passed in the necessary arguments, and stored the result in an instance called friend.

Then we used the delname function to remove an entry with the key name3 before printing the updated dictionary. In the following block, we modified the entry with the key name2 to demonstrate the functionality of setname function and printed the modified dictionary, then we deleted the entry with the key name2 and printed the updated dictionary.

{'name1': 'Sachin', 'name2': 'Rishu', 'name3': 'Yashwant', 'name4': 'Abhishek'} 

After deleting the name3 entry
{'name1': 'Sachin', 'name2': 'Rishu', 'name4': 'Abhishek'} 

name2 entry modified
{'name1': 'Sachin', 'name2': 'Yogesh', 'name4': 'Abhishek'} 

After deleting the name2 entry
{'name1': 'Sachin', 'name4': 'Abhishek'}
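The class above calls __delitem__ on its inner dictionary through helper functions. A class can also define __getitem__, __setitem__, and __delitem__ itself, so that bracket notation and the del keyword work directly on the instance. Here is a minimal sketch of that variation; the FriendList class and its names() helper are hypothetical names chosen for this illustration.

```python
class FriendList:
    """Hypothetical class defining all three item methods itself."""

    def __init__(self, **names):
        self._names = dict(names)

    def __getitem__(self, key):
        return self._names[key]

    def __setitem__(self, key, value):
        self._names[key] = value

    def __delitem__(self, key):
        # Called when `del instance[key]` is executed
        del self._names[key]

    def names(self):
        return sorted(self._names)


friends = FriendList(name1='Sachin', name2='Rishu')
friends['name3'] = 'Yashwant'   # friends.__setitem__('name3', 'Yashwant')
del friends['name1']            # friends.__delitem__('name1')

print(friends.names())
print(friends['name3'])
```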

Conclusion

We learned about the __getitem__, __setitem__, and __delitem__ methods in this article. We can compare __getitem__ to a getter function because it retrieves the value of the attribute, __setitem__ to a setter function because it sets the value of the attribute, and __delitem__ to a deleter function because it deletes the item.

We implemented these methods within Python classes in order to better understand how they work.

We've seen code examples that show what Python does internally when we access, set, and delete values.

🏆Other articles you might be interested in if you liked this one

How to use and implement the __init__ and __call__ in Python.

Types of class inheritance in Python with examples.

How underscores modify accessing the attributes and methods in Python.

Create and manipulate the temporary file in Python.

Display the static and dynamic images on the frontend using FastAPI.

Python one-liners to boost your code.

Perform a parallel iteration over multiple iterables using zip() function in Python.

That's all for now

Keep Coding

How To Use tempfile To Create Temporary Files and Directories in Python

Sachin Pal — Sun, 12 Mar 2023 14:15:49 GMT

Python has a rich collection of standard libraries to carry out various tasks. In Python, there is a module called tempfile that allows us to create and manipulate temporary files and directories.

We can use tempfile to create temporary files and directories for storing temporary data during the program execution. The module has functions that allow us to create, read, and manipulate temporary files and directories.

In this article, we'll go over the fundamentals of the tempfile module, including creating, reading, and manipulating temporary files and directories, as well as cleaning up after we're done with them.

Creating temporary file

The tempfile module includes a function called TemporaryFile() that allows us to create a temporary file for use as temporary storage. Simply enter the following code to create a temporary file.

import tempfile
# Creating a temporary file
temp_file = tempfile.TemporaryFile()
print(f'Tempfile object: {temp_file}')
print(f'Tempfile name: {temp_file.name}')
# Closing the file
temp_file.close()

First, we imported the module, and because it is part of the standard Python library, we did not need to install it.

Then we used the TemporaryFile() function to create a temporary file, which we saved in the temp_file variable. Then we added print statements to display the tempfile object and temporary file name.

Finally, we used the close() function to close the file, just as we would any other file. This will automatically clean up and delete the file.

Output

Tempfile object:
Tempfile name: C:\Users\SACHIN\AppData\Local\Temp\tmp5jyjavv2

We got the location of the object in memory when we printed the object temp_file, and we got the random name tmp5jyjavv2 with the entire path when we printed the name of the temporary file.

We can specify the dir parameter to create a temporary file in a directory of our choice.

import tempfile
# Creating a temporary file in the cwd
named_file = tempfile.TemporaryFile(
    dir='./'
)
print('File created in the cwd')
print(named_file.name)
# Closing the file
named_file.close()

The file will be created in the current working directory.

File created in the cwd
D:\SACHIN\Pycharm\tempfile_module\tmptbg9u6wk

Writing text and reading it

The mode parameter of the TemporaryFile() function defaults to 'w+b', which reads and writes the file in binary mode. We can also use 'w+t' mode to write text data into the temporary file.

import tempfile
# Creating a temporary file
file = tempfile.TemporaryFile()
try:
    # Writing into the temporary file
    file.write(b"Welcome to GeekPython.")
    # Position to start reading the file
    file.seek(0)  # Reading will start from beginning
    # Reading the file
    data = file.read()
    print(data)
finally:
    # Closing the file
    file.close()
----------
b'Welcome to GeekPython.'

In the above code, we created a temporary file and then used the write() function to write data into it, passing the string in bytes format (as indicated by the prefix 'b' before the content).

Then we used the read() function to read the content, but first, we specified the position from which to begin reading the content using the seek(0) (start from the beginning) function, and finally, we closed the file.

Using context manager

We can also create temporary files using a context manager through the with statement. The following example also shows how to write text data into the temporary file and read it back.

import tempfile
# Using with statement creating a temporary file
with tempfile.TemporaryFile(mode='w+t') as file:
    # Writing text data into the file
    file.write('Hello Geeks!')
    # Specifying the position to start reading
    file.seek(0)
    # Reading the content
    print(file.read())
    # The with statement closes the file automatically

In the above code, we used the with statement to create a temporary file and then opened it in text mode('w+t') and did everything the same as we do for handling other files in Python.

Hello Geeks!

Named temporary file

For more control over the temporary file, such as customizing its name or choosing whether to keep or delete it, we can use NamedTemporaryFile().

When creating a temporary file, we can specify a suffix and a prefix, and we can choose whether to delete or keep the temporary file. It has a delete parameter that defaults to True, which means that the file will be deleted automatically when we close it, but we can change this value to False to keep it.

import tempfile
# Creating a temporary file using NamedTemporaryFile()
named_file = tempfile.NamedTemporaryFile()
print('Named file:', named_file)
print('Named file name:', named_file.name)
# Closing the file
named_file.close()
----------
Named file: <tempfile._TemporaryFileWrapper object at 0x000001E70CDEC310>
Named file name: C:\Users\SACHIN\AppData\Local\Temp\tmpk6ycz_ek

The output of the above code is the same as when we created the temporary file using the TemporaryFile() function. That is because NamedTemporaryFile() operates exactly as TemporaryFile() does, except that the temporary file it creates is guaranteed to have a visible name in the file system.

How to customize the name of a temporary file?

As you can see, the name of our temporary file is generated at random, but we can change that with the help of the prefix and suffix parameters.

import tempfile
# Creating a temporary file using NamedTemporaryFile()
named_file = tempfile.NamedTemporaryFile(
    prefix='geek-',
    suffix='-python'
)
print('Named file name:', named_file.name)
# Closing the file
named_file.close()

The temporary file will now be created when the code is run, and its name will be prefixed and suffixed with the values we specified in the code above.

Named file name: C:\Users\SACHIN\AppData\Local\Temp\geek-n6luwuwx-python
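The delete parameter mentioned earlier can be sketched as follows. With delete=False the file survives the close, so it can be reopened by name, and removing it becomes our responsibility. The '.txt' suffix and the text written are assumptions for this illustration.

```python
import os
import tempfile

# delete=False keeps the file on disk after it is closed
with tempfile.NamedTemporaryFile(mode='w+t', prefix='geek-',
                                 suffix='.txt', delete=False) as named_file:
    named_file.write('Kept after close')
    kept_path = named_file.name

# The file still exists once the with block has closed it
exists_after_close = os.path.exists(kept_path)

# Reopening the file by its name
with open(kept_path) as file:
    content = file.read()

# Cleaning up manually, since automatic deletion was turned off
os.remove(kept_path)
exists_after_remove = os.path.exists(kept_path)

print(exists_after_close, content, exists_after_remove)
```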

Creating temporary directory

To create a temporary directory, we can use TemporaryDirectory() function from the tempfile module.

import tempfile
# Creating a temporary directory inside the TempDir
tempdir = tempfile.TemporaryDirectory(
    dir='TempDir'
)
# Printing the name of temporary dir
print('Temp Directory:', tempdir.name)
# Cleaning up the dir
tempdir.cleanup()

The above code will create a temporary directory inside the TempDir directory that we created in the root directory. Then we printed the name of the temporary directory and then called the cleanup() method to explicitly clean the temporary directory.

Temp Directory: TempDir\tmplfu2ooew

We can use the with statement to create a temporary directory, and we can also specify the suffix and prefix parameters.

with tempfile.TemporaryDirectory(
        dir='TempDir',
        prefix='hey-',
        suffix='-there'
) as tempdir:
    print("Temp Dir:", tempdir)

When we create a temporary directory using the with statement the name of the directory will be assigned to the target of the as clause. To get the name of the temporary directory, we passed tempdir rather than tempdir.name in the print statement.

Temp Dir: TempDir\hey-q14i1hc4-there

You will notice that this time we didn't use the cleanup() method. That's because the temporary directory and its contents are removed automatically when the with block exits.
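The automatic cleanup can be checked with a short sketch. It creates a file inside the temporary directory and then confirms that the directory is gone once the with block ends. The notes.txt file name is a hypothetical choice, and the directory is created in the system default location rather than in TempDir.

```python
import os
import tempfile

with tempfile.TemporaryDirectory(prefix='hey-') as tempdir:
    # Creating a file inside the temporary directory
    file_path = os.path.join(tempdir, 'notes.txt')
    with open(file_path, 'w') as file:
        file.write('temporary notes')
    existed_inside = os.path.exists(file_path)

# After the with block, the directory and its content are removed
removed_after = not os.path.exists(tempdir)

print(existed_inside, removed_after)
```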

SpooledTemporaryFile

The SpooledTemporaryFile() function is identical to TemporaryFile(), except that it stores the data in memory until the maximum size is reached or the fileno() method is called.

import tempfile
# Creating a temporary file to spool data into it
sp_file = tempfile.SpooledTemporaryFile(max_size=10)
print(sp_file)
# Writing into the file
sp_file.write(b'Hello from GeekPython')
# Printing the in-memory size of the file object
print(sp_file.__sizeof__())
# Printing if data is written to the disk
print(sp_file._rolled)
print(sp_file._file)

With the max_size parameter set to 10, we used the SpooledTemporaryFile() function to create a spooled temporary file. This means that up to 10 bytes of data can be held in memory before the data is rolled over to the on-disk file.

Then we wrote some data into the file and called the __sizeof__() method. Note that this method reports the memory size of the Python object in bytes, not the length of the data written to the file.

The private _rolled attribute shows whether the data is still in memory or has been rolled over to the on-disk temporary file. It holds a boolean value: True means the data has been rolled over, and False means the data is still in the memory of the spooled temporary file. There is also a rollover() method that forces the rollover regardless of the file's size.

<tempfile.SpooledTemporaryFile object at 0x0000023FFBD0C370>
32
True
<tempfile._TemporaryFileWrapper object at 0x0000023FFBD0D930>

The temporary file object is SpooledTemporaryFile, as can be seen. Because the amount of data we wrote into the file exceeded the allowed size, the data was rolled over, and we received the boolean value True.

Depending on the mode we specified, this function returns a file-like object whose _file attribute is either an io.BytesIO or an io.TextIOWrapper (binary or text). Until the data exceeds the max size, it is held in memory using the io.BytesIO or io.TextIOWrapper buffer.

import tempfile
with tempfile.SpooledTemporaryFile(mode="w+t", max_size=50) as sp_file:
    print(sp_file)
    # Running a while loop until the data gets rolled over
    while not sp_file._rolled:
        sp_file.write("Welcome to GeekPython")
        print(sp_file._rolled, sp_file._file)

Output

<tempfile.SpooledTemporaryFile object at 0x000001A8901CC370>
False <_io.TextIOWrapper encoding='cp1252'>
False <_io.TextIOWrapper encoding='cp1252'>
True <tempfile._TemporaryFileWrapper object at 0x000001A8906388E0>

Until the data is rolled over, we ran a while loop and printed the _file attribute of the SpooledTemporaryFile object. As we can see the data was stored in the memory using the io.TextIOWrapper buffer(mode was set to text) and when the data was rolled over, the code returned the temporary file object.

We can do the same with data in binary format.

import tempfile
with tempfile.SpooledTemporaryFile(mode="w+b", max_size=50) as sp_file:
    print(sp_file)
    # Running a while loop until the data gets rolled over
    while not sp_file._rolled:
        sp_file.write(b"Welcome to GeekPython")
        print(sp_file._rolled, sp_file._file)

Output

<tempfile.SpooledTemporaryFile object at 0x00000223C633C370>
False <_io.BytesIO object at 0x00000223C6331580>
False <_io.BytesIO object at 0x00000223C6331580>
True <tempfile._TemporaryFileWrapper object at 0x00000223C67A8850>

We can also call fileno() to roll over the file content.

import tempfile
with tempfile.SpooledTemporaryFile(mode="w+b", max_size=500) as sp_file:
    print(sp_file)
    for _ in range(3):
        sp_file.write(b'Hey, there welcome')
        print(sp_file._rolled, sp_file._file)
    print('Data is written to the disk before reaching the max size.')
    # Calling the fileno() method
    sp_file.fileno()
    print(sp_file._rolled, sp_file._file)

Output

<tempfile.SpooledTemporaryFile object at 0x000001F51D12C370>
False <_io.BytesIO object at 0x000001F51D121580>
False <_io.BytesIO object at 0x000001F51D121580>
False <_io.BytesIO object at 0x000001F51D121580>
Data is written to the disk before reaching the max size.
True <tempfile._TemporaryFileWrapper object at 0x000001F51D5A0970>
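Besides fileno(), the documented rollover() method moves the data to disk on demand. The sketch below calls it before the max size is reached and then reads the data back. It inspects the private _rolled attribute only to show the state change; since that attribute is an implementation detail, it may not be safe to rely on in real code.

```python
import tempfile

with tempfile.SpooledTemporaryFile(mode='w+b', max_size=500) as sp_file:
    sp_file.write(b'Hey, there welcome')
    # Still in memory: 18 bytes is well below max_size
    before = sp_file._rolled

    # Forcing the data onto disk
    sp_file.rollover()
    after = sp_file._rolled

    # The content is unchanged by the rollover
    sp_file.seek(0)
    data = sp_file.read()

print(before, after, data)
```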

Low-level functions

There are some low-level functions included in the tempfile module. We'll explore them one by one.

mkstemp and mkdtemp

mkstemp is used for creating a temporary file in the most secure manner possible and mkdtemp is used to create a temporary directory in the most secure manner possible.

They also accept the suffix, prefix, and dir parameters. In addition, mkstemp has a text parameter that defaults to False (binary mode); if we set it to True, the file is opened in text mode.

import tempfile
# Using mkstemp
file = tempfile.mkstemp(prefix='hello-', suffix='-world')
print('Created File:', file)
# Using mkdtemp
directory = tempfile.mkdtemp(dir='TempDir')
print('Directory Created:', directory)
----------
Created File: (3, 'C:\\Users\\SACHIN\\AppData\\Local\\Temp\\hello-xmtqn88f-world')
Directory Created: TempDir\tmp_6coyqq2

Note: There is also a function called mktemp(), but it is deprecated.
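Unlike the higher-level functions, mkstemp() returns a tuple of an OS-level file descriptor and a path, and it leaves both the closing and the deletion to the caller. A minimal sketch of one way to handle that, assuming we wrap the descriptor with os.fdopen() and remove the file ourselves (the '.txt' suffix and the text written are illustrative):

```python
import os
import tempfile

# mkstemp returns (file descriptor, absolute path)
fd, path = tempfile.mkstemp(prefix='hello-', suffix='.txt', text=True)
try:
    # Wrapping the descriptor in a file object; closing it closes fd too
    with os.fdopen(fd, 'w') as file:
        file.write('written through the descriptor')

    with open(path) as file:
        content = file.read()
finally:
    # mkstemp never deletes the file for us
    os.remove(path)

still_exists = os.path.exists(path)
print(content, still_exists)
```

The same responsibility applies to mkdtemp(): the directory it creates stays on disk until the caller removes it.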

gettempdir and gettempdirb

gettempdir() returns the name of the directory used for temporary files, and gettempdirb() returns the same name as bytes.

import tempfile
print(tempfile.gettempdir())
print(tempfile.gettempdirb())
----------
C:\Users\SACHIN\AppData\Local\Temp
b'C:\\Users\\SACHIN\\AppData\\Local\\Temp'

Conclusion

The tempfile module, which offers functions to create temporary files and directories, was covered in this article. We have used various functions from the tempfile module in code examples, which has helped us better understand how these functions operate.

🏆Other articles you might be interested in if you liked this one

Use match case statement for pattern matching in Python.

Types of class inheritance in Python with examples.

An improved and modern way of string formatting in Python.

Integrate PostgreSQL database with Python.

Argmax function in NumPy and TensorFlow - Are they the same?

Take multiple inputs from the user in a single line in Python.

Perform a parallel iteration over multiple iterables using zip() function in Python.

That's all for now

Keep Coding

Team - GeekPython

Type Hinting in Python

What is Type Hint?

Performing a Check

Annotating Multiple Return Types

Alternative Types for a Return Value

Multiple Return Values of Different Types

Type Hinting Functions

Type Hinting Iterables

Type Aliases for Better Readability

Type Checker Tools

Static Type Checking Using Mypy

Resources

Conclusion

Best Practices: Positional and Keyword Arguments

Positional & Keyword Arguments

Stick to the Rule

Restrict Arguments to be Positional-only and Keyword-only

Optional Arguments/Arguments with Default Value

Variadic Arguments

Argument Ordering

Argument Type Hinting

Naming the Arguments Properly

Conclusion

Creating a MySQL Database in Python

Installing PyMySQL

Creating MySQL Database

Interacting with Database

Creating a Database Table

Adding Data to the Database

Reading Data from the Database

Updating Data in the Database

Deleting Data from the Database

Conclusion

Slash(/) and Asterisk(*) in Function Definition

Why Slash and Asterisk are Used?

Slash and Asterisk in Function Parameter

Can Asterisk Be Used Ahead of Slash

Using Either One of Slash (/) or Asterisk (*) in the Function's Parameter?

Writing a Function that Accepts Positional-only Arguments

Writing a Function that Accepts Keyword-only Arguments

Valid & Invalid Function Definitions

Conclusion

A Comprehensive Guide to Decorators in Python

Decorator

Creating a Custom Decorator

How Decorators Work?

Defining Decorator Without Inner Function

Handling Function Arguments Within Decorator

Defining Decorator With Inner Function to Handle Function Arguments

Returning Values from Decorator

Creating Decorator that Accepts Argument

Stacking Multiple Decorators on Top of a Function

Practical Example

Conclusion

Python's yield Keyword - How it Works

yield Keyword

yield in a Function

How yield Works in a Function?

StopIteration Exception

When to Use yield?

Conclusion

Parse TOML files Using tomllib Module in Python

TOML File

tomllib - Parsing TOML Files

Functions

Parsing a TOML File

Exception Handling

Loading TOML from String

Conclusion

How to Use split() Method in Python

Syntax

Splitting based on delimiter

Using maxsplit

Example

Conclusion

Python's __getitem__ Method

The __getitem__ Method

Syntax

Example

What is if __name__ == '__main__' in Python Programs

Understanding if __name__ == '__main__'?