Gist Blog

Django: Bulk update any field in any model with a generic management script!

Generic Bulk Updating

After writing up a number of CLI scripts to update various model values in a Django app, it occurred to me that a generic script could be written that would work for all of my purposes. What follows is a simple catch-all script that can be used in any django app as is!

The script works by taking in the following values:

  • Model name
  • Field name
  • New value type
  • New value
  • Comma separated list of IDs

Setup

To use this script, place it in the following dir in your project: <project>/<app>/management/commands/.

The script as written depends on the rich library: pip install rich.

Tutorial

Let's say I have a model defining a Person that includes a deleted field acting as a soft delete.

from django.db import models


class Person(models.Model):
    first_name = models.CharField(max_length=30)
    last_name = models.CharField(max_length=30)
    created = models.DateTimeField(auto_now_add=True, null=False)
    deleted = models.BooleanField(default=False)

Say I am given a task to soft delete all users that were created before 2020, here is a possible workflow I can use:

  1. Query my table for a list of IDs to update:
    SELECT STRING_AGG(id::text, ',')
    FROM appname_person
    WHERE created < '2020-01-01' AND deleted = false;
    
    Here I return a list of the IDs for records I want to update. Note that the results here will be a comma separated list, ready for use on the commandline: 1,2,4
  2. Next I run the following command to update the deleted field:
    $ python manage.py update_model_field.py Person deleted bool true 1,2,4
    
    Updating 3 record(s) from the model: appname.models.Person
    The field appname.models.Person.deleted will be updated to: bool(True)
    Would you like to continue? [y/N]
    
  3. After I enter y the script completes, I will see Done

Possible Improvements

  • Allow a file of IDs to be passed as the input IDs. This is more manageable for a large number of IDs.
  • Remove rich as a dependency.
  • Reference model without use of globals: https://stackoverflow.com/a/75888308
  • The lambdas that convert the types as the values for ValueTypes:
    class ValueTypes(enum.Enum):
        INT = lambda x: int(x)
        STR = lambda x: str(x)
        FLOAT = lambda x: float(x)
        DATETIME = lambda x: datetime.strptime(x, "%m/%d/%y %H:%M:%S")
        DATETIME_NOW = lambda: datetime.now()
        NONE = lambda: None
    
import logging
from datetime import datetime  # noqa: F403,F401
from enum import Enum
from typing import Any, List

from django.contrib.auth import get_user_model  # noqa: F403,F401
from django.core.management.base import BaseCommand
from django.db import models  # noqa: F403,F401
from django.db.models import Field
from rich.logging import RichHandler

from myapp.models import *  # noqa: F403,F401

User = get_user_model()

logging.basicConfig(
    level="NOTSET",
    format="%(message)s",
    datefmt="[%X]",
    handlers=[RichHandler(rich_tracebacks=True)],
)

log = logging.getLogger("rich")


class ModelUpdater:
    class ValueTypes(Enum):
        int = "int"
        str = "str"
        float = "float"
        datetime = "datetime"
        datetime_now = "datetime_now"
        none = "None"
        bool = "bool"

        def __str__(self):
            return self.value

    def __init__(
        self,
        model: models.Model,
        model_ids: List[int],
        field_to_update: str,
        new_value: Any,
        new_value_type: str,
    ):
        self.model = model
        self.model_ids = set(model_ids)
        self.records = self.model.objects.filter(id__in=self.model_ids)
        self.field_to_update = field_to_update
        self.__new_value = new_value
        self.new_value_type = new_value_type

        if not self.records:
            raise ValueError("No valid record IDs provided to ModelUpdater")

    @property
    def new_value(self) -> Any:
        """
        Cast the new value to its correct type
        :return:
        """
        if self.new_value_type == self.ValueTypes.int:
            return int(self.__new_value)
        elif self.new_value_type == self.ValueTypes.str:
            return str(self.__new_value)
        elif self.new_value_type == self.ValueTypes.none:
            return
        elif self.new_value_type == self.ValueTypes.float:
            return float(self.__new_value)
        elif self.new_value_type == self.ValueTypes.datetime:
            # format example: '09/19/22 13:55:26'
            return datetime.strptime(self.__new_value, "%m/%d/%y %H:%M:%S")
        elif self.new_value_type == self.ValueTypes.datetime_now:
            # format example: '09/19/22 13:55:26'
            return datetime.now()
        elif self.new_value_type == self.ValueTypes.bool:
            return self.__new_value.lower() in ["true", "1"]
        else:
            raise ValueError(f"{self.new_value_type=} is not a supported type.")

    @property
    def model_field_being_updated(self) -> Field:
        return self.model._meta.get_field(self.field_to_update)

    def run(self):
        change = {self.field_to_update: self.new_value}
        self.records.update(**change)


class Command(BaseCommand):
    help = "Change the value on a model field in bulk"

    def add_arguments(self, parser):
        parser.add_argument(
            "model",
            type=str,
            help="The name of the model you'd like to soft delete records from",
        )
        parser.add_argument(
            "field_name", type=str, help="Field on the model that is to be updated"
        )
        parser.add_argument(
            "new_value", type=str, help="The value the field should be updated to"
        )
        parser.add_argument(
            "new_value_type",
            type=ModelUpdater.ValueTypes,
            choices=list(ModelUpdater.ValueTypes),
            help="The type of the value",
        )
        parser.add_argument("ids", type=str, help="comma separated string of IDs")

    def handle(self, *args, **options):
        updater = ModelUpdater(
            globals()[options["model"]],
            list(map(int, options["ids"].split(","))),
            options["field_name"],
            options["new_value"],
            options["new_value_type"],
        )
        log.info(
            f"Updating {len(updater.records)} record(s) from the model: {updater.model} "
        )
        log.info(
            f"The field '{updater.model_field_being_updated}' will be updated to: "
            f"{updater.new_value.__class__.__name__}({updater.new_value})"
        )
        if input("Would you like to continue? [y/N]").strip().lower() != "y":
            log.info("Aborting....")
            exit()
        else:
            updater.run()
            log.info("Done")

Comments

Author
Apr 21, 2023

Future updates:

  1. Reference model without use of globals: https://stackoverflow.com/a/75888308
  2. Use lambda's in enum to convert values
In [13]:     class ValueTypes(enum.Enum):
    ...:         INT = lambda x: int(x)
    ...:         STR = lambda x: str(x)
    ...:         FLOAT = lambda x: float(x)
    ...:         DATETIME = lambda x: datetime.strptime(x, "%m/%d/%y %H:%M:%S")
    ...:         DATETIME_NOW = lambda: datetime.now()
    ...:         NONE = lambda: None
    ...:
    ...:         def __str__(self):
    ...:             return self.value

To make a comment, please visit this posts Gist.

Add your comment!