Development and Operations

Introducing the trio of software development, project management, and data science
Seif Sekalala
| January 24, 2023

In this blog post, the author introduces a new blog series about the three titular disciplines or knowledge domains: software development, project management, and data science. In a mercurial, ever-evolving global digital economy, how can job-seekers harness the lucrative value of those fields, especially data science, to improve their employability?

 

Introduction/Overview:

To help us launch this blog series, I will gladly divulge two embarrassing truths. These are: 

  1. Despite my marked love of LinkedIn, and despite my decent (above-average) level of general knowledge, I cannot keep up with the ever-changing statistics and news reports on whether, at any given time, the global economy favors job-seekers, favors employers, or is at equilibrium for all parties (i.e., governments, employers, and workers).
  2. Despite having rightfully earned those fancy three letters after my name, as well as a post-graduate certificate from the University of New Mexico and DS-Dojo, I (used to think I) hate math, or I (used to think I) cannot learn math; not even if my life depended on it!

 

Background:

Following my undergraduate years of college algebra and basic discrete math, and despite my hatred of mathematics since 2nd grade (chief culprit: multiplication tables!), I had fallen in love (head-over-heels indeed!) with the interdisciplinary field of research methods. And sure, I had lucked out in my Master's (of Arts in Communication Studies) program, as I only had to take the qualitative methods course.

 

A Venn-diagram depicting the disciplines/knowledge-domains of the new blog series.

 

But our instructor couldn’t really teach us about interpretive methods, ethnography, qualitative interviewing, etc., without at least “touching” on quantitative interviewing/surveys and quantitative data analysis (e.g., via word counts, content analysis, etc.).

Fast-forward; year: 2012. Place: Drexel University–in Philadelphia, for my Ph.D. program (in Communication, Culture, and Media). This time, I had to face the dreaded mathematics/statistics monster. And I did, but grudgingly.

Let’s just get this over with, I naively thought; after all, besides passing this pesky required pre-qualifying exam course, who needs stats?!

 

About software development:

Fast-forward again; year: 2020. Place(s): Union, NJ and Wenzhou, Zhejiang Province; Hays, KS; and Philadelphia all over again. Five years after earning the Ph.D., I had to reckon with an unfair job loss, and chaotic seesaw-moves between China and the USA, and Philadelphia and Kansas, etc. 

Thus, one thing led to another, and soon enough, I was practicing algorithms and data structures, and learning about the basic “trouble-trio” of web development (i.e., HTML, CSS, and JavaScript)!

 


 

But like many other folks who try this route, I soon came face-to-face with that oh-so-debilitative monster: self-doubt! No way, I thought. I’m NOT cut out to be a software-engineer! I thus dropped out of the bootcamp I had enrolled in and continued my search for a suitable “plan-B” career.

 

About project management:

Eventually (around mid/late-2021), I discovered the interdisciplinary field of project management. Simply put, project management is the discipline of planning and steering projects, and a project can be defined (e.g., by Te Wu, 2020) as

“A time-limited, purpose-driven, and often unique endeavor to create an outcome, service, product, or deliverable.”

One can also break down the constituent conceptual parts of the field (e.g., as defined by Belinda Goodrich, 2021) as:

  • Project life cycle, 
  • Integration, 
  • Scope, 
  • Schedule, 
  • Cost, 
  • Quality, 
  • Resources, 
  • Communications, 
  • Risk, 
  • Procurement, 
  • Stakeholders, and 
  • Professional responsibility / ethics. 

 

Ah…yes! I had found my sweet spot, indeed. Or so I thought.

 

Hard truths:

Eventually, I experienced a series of events that can be termed “slow-motion epiphanies” and hard truths. Among many, below are three prime examples.

 

Hard Truth 1: The quantifiability of life:

For instance, among other “random” models: one can generally presume–with about 95% certainty (ahem!)–that most of the phenomena we experience in life can be categorized under three broad classes:

 

  1. Phenomena we can easily describe and categorize, using names (nominal variables);
  2. Phenomena we can easily rank or group in discrete, ordered amounts (ordinal variables);
  3. And phenomena that we can measure more accurately, and which: i) can be ordered like class two above, but in evenly spaced units, and ii) have a true 0 (ratio variables; e.g., Wrench et al.).
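
As a rough illustration of these three classes, the Python sketch below pairs each one with a summary statistic that suits it. The survey answers and variable names are invented for illustration only and do not come from any real dataset.

```python
from statistics import mean, median, mode

# Hypothetical survey answers, invented purely for illustration.
favorite_language = ["Python", "JavaScript", "Python", "PHP"]  # nominal: names only
satisfaction_rank = [1, 3, 2, 3, 3]  # ordinal: 1 = low, 2 = medium, 3 = high
years_experience = [0, 2, 5, 9]      # ratio: evenly spaced units with a true zero

# Names can be counted, so the most frequent value (the mode) is meaningful.
most_common_language = mode(favorite_language)

# Ranks can be sorted, so the middle value (the median) is meaningful.
middle_satisfaction = median(satisfaction_rank)

# A true zero and even spacing make averages and ratios meaningful.
average_experience = mean(years_experience)

print(most_common_language, middle_satisfaction, average_experience)
```

The point of the sketch is only that the class of a variable limits which calculations make sense for it.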

 

Hard Truth 2: The probabilistic essence of life:

Regardless of our spiritual beliefs, or whether or not we hate math/science, etc., we can safely presume that the universe we live in is more or less a result of probabilistic processes (e.g., Feynman, 2013). 

 

Hard Truth 3: What was that? “Show you the money (!),” you demanded? Sure! But first, show me your quantitative literacy and critical-thinking skills!

And finally, related to both the above realizations: while it is true indeed that there are no guarantees in life, we can nonetheless safely presume that professionals can improve their marketability by demonstrating their critical-thinking and quantitative-literacy skills.

 

Bottom line: The value of data science

Overall, the above three hard truths are prototypical examples of the underlying rationale(s) for this blog series. Each week, DS-Dojo will present our readers with some “food for thought” vis-a-vis how to harness the priceless value of data science and various other software-development and project-management skills / (sub-)topics. 

 

No, dear reader; please do not be fooled by that “OmG, AI is replacing us (!)” fallacy. Regardless of how “awesome” all these new fancy AI tools are, the human touch is indispensable!

LAMP vs LEMP – Open-source solution stacks
Saad Shaikh
| December 10, 2022

Data Science Dojo is offering LAMP and LEMP for FREE on Azure Marketplace packaged with pre-installed components for Linux Ubuntu. 

 

What are web stacks? 

 

Solution stacks are collections of separate components that together form a complete application development environment. The layers of a web solution stack communicate and connect with each other to form a comprehensive system that lets developers build websites efficiently and flexibly. Components earn a place in a stack because they are compatible and frequently used together.

LAMP vs LEMP 

 

Now what do these two terms mean? Have a look at the table below: 

 

No. | LAMP | LEMP
1. | Stands for Linux, Apache, MySQL/MariaDB, PHP/Python/Perl | Stands for Linux, Nginx (Engine-X), MySQL/MariaDB, PHP/Python/Perl
2. | Uses the Apache2 web server to process requests over HTTP | Uses the Nginx web server to transfer data over HTTP
3. | Can involve heavy server configurations | Uses Nginx, a lightweight web server and reverse proxy
4. | One of the earliest open-source stacks for web development | A variation of LAMP built on relatively newer technology
5. | Process-driven design because of Apache | Event-driven design because of Nginx

 

Pro Tip: Join our 6-months instructor-led Data Science Bootcamp to master data science skills 

 

 

Challenges faced by web developers 

 

Developers often face the challenge of integrating components optimally during web app development. Interoperability and interdependency issues frequently arise during the development and production phases.

Apart from that, the conventional web stack would sometimes cause problems because of the web server's heavy architecture. As a result, organizational websites suffered downtime.

In this scenario, programming a website and managing a database from a single machine, connected to a web server without any interdependency issues, was considered an ideal solution, and one that developers were looking forward to deploying.

 

Working of LAMP and LEMP 

 

LAMP and LEMP are open-source web stacks that package the Apache2 or Nginx web server, the MySQL database, and the PHP programming language, all running together on top of a Linux machine. Both stacks are used for building high-performance web applications. All layers of LAMP and LEMP are compatible with each other, so both stacks are well suited to hosting, serving, and managing web content.

 

Figure 1: LAMP Architecture (Courtesy: https://www.javatpoint.com/what-is-lamp )

 

The LEMP architecture looks like that of LAMP, except that the Nginx web server replaces Apache.

 

Major features 

 

  • All the layers of LAMP and LEMP integrate tightly, with no interdependency issues
  • They are open-source web stacks. LAMP has extensive support thanks to its experienced community
  • Both provide fast operation, whether for querying, programming, or web server performance
  • Both stacks are flexible, which means that any layer can be swapped for an alternative open-source tool
  • LEMP focuses on low memory usage and has a lightweight architecture 

 

What Data Science Dojo has for you 

 

The LAMP and LEMP offers packaged by Data Science Dojo are open-source web stacks for creating efficient and flexible web applications, with all respective components pre-configured so that you are spared the burden of installation. Each offer includes:

  • A Linux Ubuntu VM pre-installed with all LAMP/LEMP components
  • The MySQL database management system for creating databases and handling web content
  • The Apache2/Nginx web server, whose job is to process requests and send data via HTTP over the internet
  • Support for the PHP programming language, which is used for fully functional web development
  • PhpMyAdmin, which can be accessed at http://your_ip/phpmyadmin
  • Customizability, meaning users can replace each component with an open-source alternative
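
Once a VM is running, one simple way to confirm that the web server and PhpMyAdmin respond is to request their URLs and inspect the HTTP status code. The sketch below uses only the Python standard library; the helper name probe_status is hypothetical and is not part of the offer, and your_ip remains a placeholder for your VM's address.

```python
import urllib.error
import urllib.request


def probe_status(url, timeout=5.0):
    """Return the HTTP status code for url, or None if nothing answered."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 500.
        return err.code
    except OSError:
        # Covers URLError, refused connections, DNS failures, and timeouts.
        return None
```

For example, probe_status("http://your_ip/phpmyadmin") would be expected to return 200 when PhpMyAdmin is reachable, and None when nothing is listening at that address.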

 

Conclusion 

 

Running either of the stacks discussed above in the cloud supports high availability, because data can be distributed across multiple data centers and availability zones on the go. In this way, Azure increases the fault tolerance of data stored in the stack application. Azure also helps the MySQL database achieve high throughput by providing low latency for executing complex queries.

Since LEMP/LAMP is designed to create websites, growth in web-related data can be managed by scaling up. The flexibility, performance, and scalability that an Azure virtual machine provides to LAMP/LEMP make it possible to host, manage, and modify applications of all types, even under heavy traffic.

At Data Science Dojo, we deliver data science education, consulting, and technical services to increase the power of data. Don’t wait to install this offer by Data Science Dojo, your ideal companion in your journey to learn data science!  

Click on the buttons below to head over to the Azure Marketplace and deploy LAMP/LEMP for FREE by clicking on “Try now”. 


Note: You’ll have to sign up to Azure, for free, if you do not have an existing account. 

Load testing with Locust – A modern tool for quality assurance 
Saad Shaikh
| November 26, 2022

Data Science Dojo is offering Locust for FREE on Azure Marketplace, packaged with a pre-configured Python interpreter and a Locust web server for load testing.

 

Why and when do we perform testing? 

Testing evaluates and confirms that a software application or product performs as intended. Its purpose is to determine whether the application satisfies business requirements and whether the product is market-ready. Applications can be subjected to automated testing, in which testing tools execute scripted sequences, to see whether they meet those requirements.

The merits of automated testing are: 

  • Bugs can be avoided 
  • Development costs can be reduced 
  • Performance can be improved until it meets requirements
  • Application quality can be enhanced 
  • Development time can be saved 

In the traditional SDLC (Software Development Life Cycle), testing is usually the last phase before deployment.

What is load testing and why choose Locust?  

Performance testing is one of several types of software testing. Load testing is a type of performance testing that evaluates how a system behaves under real-life load conditions. It involves the following stages:

  • Define crucial metrics and scenarios 
  • Plan the test load model 
  • Write test scenarios 
  • Execute test by swarming load 
  • Analyze the test results 

Locust is a modern load-testing framework. The major reason many testers prefer it over other tools like JMeter is that it uses an event-based approach to testing rather than a thread-based one. This consumes fewer resources and thus saves costs.
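
To make the event-based idea concrete, here is a minimal sketch in which a thousand simulated users share a single thread. It uses asyncio from the Python standard library as a stand-in; Locust itself is built on gevent, so this illustrates the general technique rather than Locust's implementation. The function names and the 0.01-second wait are assumptions chosen for the example.

```python
import asyncio
import threading


async def simulated_user(user_id, results):
    # The sleep stands in for waiting on a network response. While one
    # user waits, the event loop switches to another user.
    await asyncio.sleep(0.01)
    results.append((user_id, threading.get_ident()))


async def run_swarm(n_users):
    results = []
    # Start all simulated users concurrently inside one event loop.
    await asyncio.gather(*(simulated_user(i, results) for i in range(n_users)))
    return results


results = asyncio.run(run_swarm(1000))
thread_ids = {thread_id for _, thread_id in results}
print(len(results), "simulated users ran on", len(thread_ids), "thread(s)")
```

A thread-based tool would typically dedicate one operating-system thread to each simulated user, which is where the extra memory and scheduling cost comes from.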


Challenges faced by QA teams  

Before such practical testing tools existed, the job of testing teams was not as easy as it is now. Simulating a large number of users to direct load at a website was expensive and time-consuming.

Apart from this, monitoring the testing process in real time was not common either. Complete analytics were usually produced only after the whole testing process had concluded, which again required patience.

Testers needed a platform through which they could evaluate the quality of a product and its compliance with the specified requirements under different loads, without the prolonged wait and high expense.

Working of Locust 

Locust is an open-source, web-based load-testing tool. It is written in Python and is used to evaluate the functionality and behavior of a web application under load. In any business's quality assurance process, load testing is critical for ensuring that the website stays up during a traffic influx, which ultimately contributes to the company's success. Through Locust, web testers can determine how many concurrent users a website can withstand. With the power of Python, you can develop a set of test scenarios and functions that imitate many users, and you can observe performance charts in the web UI.

 

Figure 1: A sample locustfile.py 

 

The self.client.get function requests the pages of the website that you want to target. You can find this code file and further breakdown here. The host domain, the number of users, and the spawn rate for the load test are supplied in the web interface. After you run the locust command, the web server starts on port 8089.

 

Figure 2: Locust web interface

 

Locust also lets you capture different metrics in real time during the testing process.

 

Figure 3: Graphs with metrics visualizations

 

Key characteristics of Locust 

 

  • After you execute the file, an interactive, user-friendly web UI starts, through which you can perform load testing
  • Locust is an open-source load-testing tool. It is extremely useful for web app testers, QA teams, and software testing managers
  • You can capture various metrics, such as response time, visualized in charts in real time as the testing occurs
  • You can write test code in the pre-configured Python interpreter to check whether your application achieves the throughput and availability you need
  • You can easily scale up the number of users for extensive, production-level load testing of web applications

 

What Data Science Dojo provides 

 

The Locust instance packaged by Data Science Dojo comes with a pre-configured Python interpreter for writing test files and a Locust web UI server for generating the desired amount of load at specific rates, without the burden of installation.

Features included in this offer:  

  • VM configured with Locust application which can start a web server with rich UX/UI 
  • Provides several interactive metrics graphs to visualize the testing results 
  • Provides real-time monitoring support 
  • Ability to download requests statistics, failures, exceptions, and test reports 
  • Feature to swarm multiple users at the desired spawn rate 
  • Support for the Python language to write complex workflows
  • Utilizes event-based approach to use fewer resources 

With Locust, load testing is easier than ever. It saves businesses time and cost, because QA engineers and web testers can now run tests with a few clicks and a few lines of simple code.

 

 

Conclusion 

 

Locust can be used to test any web application. By swarming many clients that spawn at a specific rate, you can verify that a website can manage concurrent users. To achieve extensive load testing, you can use multiple cores on an Azure Virtual Machine. Locust's web interface also calculates metrics for every test run and visualizes them. This might slow down the server if you have hundreds upon hundreds of active test units requesting multiple pages. CPU and RAM usage may also be affected, but an adequately sized Azure Virtual Machine helps absorb this load.

At Data Science Dojo, we deliver data science education, consulting, and technical services to increase the power of data. We are adding a free Locust application dedicated specifically to testing operations on Azure Marketplace. Now hurry up and install this offer by Data Science Dojo, your ideal companion in your journey to learn data science!

 

Click on the button below to head over to the Azure Marketplace and deploy Locust for FREE by clicking on “Try now” 


Note: You will have to sign up to Azure, for free, if you do not have an existing account. 
