How-To

Bring your Python ML Model to iOS App in under Three Minutes

Integrating Python-based machine learning models into iOS applications can be challenging, particularly when converting models into a Swift-compatible format. This example will demonstrate a simple image classification task using the Fashion-MNIST dataset and CoreML conversion tools. The goal is to illustrate the effort required to deploy small-to-medium complexity ML models within iOS applications. The demonstration is based on a Convolutional Neural Network (CNN) built with PyTorch, but the concepts apply broadly to other Python-based models as well. ...

Automatically Refreshing NVim plugins

One of the key benefits of modern editors like NVim, Vim, or Emacs is the rich plugin ecosystem. After years with Vim, I switched to NVim and was immediately impressed by its plugin landscape. The Lazy plugin manager—available for NVim > 0.8—quickly became my favourite. Lazy simplifies plugin discovery and management. It offers an intuitive interface and powerful commands that make it easy to add, remove, or update plugins. ...

One line docker commands

Setting up a robust data science development environment takes time, and it’s a process that’s rarely ever finished. If you’re the type who likes to get the most out of your tools, you’ll likely enjoy tweaking, optimising, and layering your workspace with productivity enhancements. That might mean refining your Python setup to easily manage multiple language versions and dependencies, or expanding your text editor with plugins for linting, code suggestions, unit test execution, and CI/CD integration. ...

Version Control your Dotfiles

What are .dotfiles? Dotfiles are hidden configuration files on Unix-like systems. Their filenames start with a dot (.), making them hidden by default. They store preferences and settings for programs like shells, text editors, and version control systems. Many modern Linux applications follow the XDG Base Directory Specification. This guideline recommends placing user-specific configuration files in ~/.config (or $XDG_CONFIG_HOME). Using this standard reduces clutter in home directories and simplifies managing configurations across systems. ...

Aggresively formating your Python files

Vim provides a wide range of functions for file formatting, starting with basic features such as reindent. VimL Implementation Creating a function within Vim to process the file is likely the most straightforward approach. The primary purpose of this function is to pass the filename to an external command for formatting. Leveraging the rich ecosystem of Python formatting tools available from the command line allows the function to efficiently and consistently format files, tapping into powerful, pre-existing solutions for code aesthetics and standardization. In effect, the role of the function is to pass the filename to the call below: ...

Using RScript for R Installation Managment

Most frequently, users tend to undertake common R installation and management tasks from within the R session. Frequently making use of commands, like install.packages, update.packages or old.packages to obtain or update packages or update/verify the existing packages. Those common tasks can also be accomplished via the GUI offered within RStudio, which provides an effortless mechanism for undertaking basic package management tasks. This is approach is usually sufficient for the vast majority of cases; however, there are some examples when working within REPL^[REPL stands for Read Eval Print Loop and is usually delivered in a form of an interactive shell. While working in Python users would commonly access REPLY by running python or ipython, more details.] to accomplish common installation tasks is not hugely convenient. ...

R-based metaprogramming strategies for handling Hive/CSV interaction (Part I, imports)

Background Handling Hive/CSV interaction is a common reality of many analytical and data environments. The question on exporting data from Hive to CSV and other formats is frequently raised on online forums with answers frequently suggesting making use of sed that combined with nifty regular expressions pipes Hive output into a flat CSV files as an exporting solution. Import of large amounts of data is best handled by suitable tools like Apache Flume. That is fine for simpler tables but may prove problematic for tables with a large amount of unstructured text. Frequently analysts and data scientists are faced with a challenge with storing data Hive on a irregular semi-regular basis. For instance, a job may produce new forecasting scenarios that we may want to make available through a Hive tables. ...

Using R for File Manipulation

Challenge File manipulation is a frequent task unavoidable in almost every IT business process. Traditionally, file manipulation tasks are accomplished within the ramifications of specific tools native to a given system. As such, the one may consider writing and scheduling shell script to undertake frequent file operations or using more specific purpose-built tools like logrotate in order to archive logs or tools like Kafka are used to build streaming-data pipelines. R is usually though of as a statistical programming language or as an environment for a statistical analysis. The fact that R is a mature programming language able to successfully accomplish a wide array of traditional tasks is frequently ignored. What constitutes a programming language is a valid question. Wikipedia offers somehow wide definition: ...

Inserting Data into Partitioned Table

Rationale Maintaining partitioned Hive tables is a frequent practice in a business. Properly structured tables are conducive to achieving robust performance through speeding up query execution (see Costa, Costa, and Santos 2019). Frequent use cases pertain to creating tables with hierarchical partition structure. In context of a data that is refreshed daily, the frequently utilised partition structure reflects years, months and dates. Creating partitioned table In HiveQL we would create the table with the following structure using the syntax below. In order to keep the development tidy, I’m creating a separate database on Hive which I will use for the purpose of creating tables for this article. ...

Poor Man's Robust Shiny App Deployment (Part II)

Introduction This article draws on the past post concerned with utilisation of golem for robust deployment of analytical and reporting solutions. For this article, we will assume that we are working with defined working requirements that utilise some of the Labour Market Statistics disseminated through the nomis portal. Change Plan What we have Reporting requirements Past scripts we used to create reports with accompanying instructions What we want Stronger business continuity - we want to be able to give some access to this project and don’t be concerned with missing files, outdated unavailable documentation and questions on how to produce updated reports. We want self-encompassing entity that takes of care of its technical requirements and user-interaction^[Good parallel can be drawn between this approach and manuals available with life-saving equipment. Equipment delivers technical capacity and manual ensures operational capacity. In case of an inexperienced user one is not useful without the other. We want to ensure that user with minimum required capacity can use the tools correctly.] ...