data analytics

Data Analysis Expressions (DAX) is a language used in Analysis Services, Power BI, and Power Pivot in Excel. DAX formulas include functions, operators, and values to perform advanced calculations and queries on data in related tables and columns in tabular data models. 

 The Basics of DAX for Data Analysis 

DAX is a powerful language that can be used to create dynamic and informative reports that can help you make better decisions. By understanding the basics of Data Analysis Expressions, you can: 

  • Perform advanced calculations on data 
  • Create dynamic filters and calculations 
  • Create measures that can be used in reports 
  • Build tabular data models 

Creating DAX Tables, Columns, and Measures 

DAX tables are similar to Excel tables, but they live in a data model and can carry calculated columns and measures. A calculated column is a formula evaluated row by row, with the result stored in the table for every row. A measure is a formula evaluated on demand over whatever rows the current report filters leave in scope, so the same measure can return different results in different visuals. 

The exact steps depend on the tool. In Power BI Desktop, for example, you create a calculated table by selecting New table on the Modeling ribbon and typing a formula in the formula bar; the name to the left of the equals sign becomes the table name. 

To create a calculated column, select the table, choose New column, and enter a name followed by the formula that you want to use.

To create a measure, choose New measure (or right-click the table in the field list and select New measure), then enter a name followed by the formula that you want to use. 
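To make the difference between the two concrete, here is a minimal plain-Python sketch (not DAX itself) of how a calculated column and a measure behave. The `sales` rows, the column names, and the `total_sales` helper are invented for illustration, and the roughly equivalent DAX formulas in the comments assume a hypothetical `Sales` table.

```python
# Hypothetical Sales table: each dict is one row.
sales = [
    {"region": "East", "quantity": 2, "unit_price": 10.0},
    {"region": "East", "quantity": 1, "unit_price": 25.0},
    {"region": "West", "quantity": 4, "unit_price": 5.0},
]

# Calculated column: evaluated once per row and stored with the table.
# Roughly: Line Total = Sales[Quantity] * Sales[UnitPrice]
for row in sales:
    row["line_total"] = row["quantity"] * row["unit_price"]


def total_sales(rows, region=None):
    """Measure: aggregated on demand over the rows the current filter keeps.

    Roughly: Total Sales = SUM ( Sales[Line Total] )
    """
    visible = [r for r in rows if region is None or r["region"] == region]
    return sum(r["line_total"] for r in visible)


print(total_sales(sales))                 # no filter: every row contributes
print(total_sales(sales, region="East"))  # same measure under a region filter
```

The column values never change once computed, while the measure is recomputed each time the filter changes.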

Executing DAX Operators 

DAX operators combine values inside a formula. The main groups are: 

  • Arithmetic operators: addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (^). 
  • Comparison operators: =, <>, <, >, <=, and >= compare two values and return a Boolean value (TRUE or FALSE). 
  • Logical operators: && (and) and || (or) combine Boolean expressions and return a Boolean value. 
  • Text operator: the ampersand (&) concatenates two text strings. 

Discussing Basic Math & Statistical Functions 

DAX includes a wide variety of mathematical and statistical functions. Some of the most common are: 

  • SUM: Returns the sum of all numbers in a column. 
  • AVERAGE: Returns the average (arithmetic mean) of all numbers in a column. 
  • COUNT: Returns the number of non-blank values in a column. 
  • MAX: Returns the largest value in a column. 
  • MIN: Returns the smallest value in a column. 
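As a rough illustration of what these aggregations compute, the plain-Python sketch below applies each one to a single column that contains blanks. The `amount` values are made up, `None` stands in for a blank cell, and the DAX formulas in the comments assume a hypothetical `Sales[Amount]` column.

```python
# Hypothetical column of values; None stands in for a blank cell.
amount = [120.0, None, 80.0, 200.0, None]

# Blanks are skipped, so only three values take part in the aggregations.
non_blank = [v for v in amount if v is not None]

total = sum(non_blank)             # roughly SUM ( Sales[Amount] )
count = len(non_blank)             # roughly COUNT ( Sales[Amount] )
average = total / count            # roughly AVERAGE ( Sales[Amount] )
largest = max(non_blank)           # roughly MAX ( Sales[Amount] )
smallest = min(non_blank)          # roughly MIN ( Sales[Amount] )

print(total, count, round(average, 2), largest, smallest)
```

Note that the average is taken over the three non-blank values, not over all five rows.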

Implementing Date & Time Functions 

DAX includes many date and time functions. Some common ones are: 

  • DATEADD: Returns a set of dates shifted forward or backward from the dates in the current context by a specified number of days, months, quarters, or years. 
  • DATEDIFF: Returns the number of interval boundaries (such as days, months, years, or hours) between two dates. 
  • TODAY: Returns the current date. 
  • NOW: Returns the current date and time. 
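The sketch below mimics DATEDIFF, TODAY, and NOW in plain Python to show what "the number of months between two dates" means in practice. It assumes the boundary-counting behaviour described in the DAX function reference; the helper names and the sample dates are invented for illustration.

```python
from datetime import date, datetime


def datediff_days(start, end):
    """Days from start to end, similar to DATEDIFF ( start, end, DAY )."""
    return (end - start).days


def datediff_months(start, end):
    """Month boundaries crossed, similar to DATEDIFF ( start, end, MONTH )."""
    return (end.year - start.year) * 12 + (end.month - start.month)


order_date = date(2023, 1, 31)
ship_date = date(2023, 2, 1)

# One day apart, yet one month boundary has been crossed.
print(datediff_days(order_date, ship_date))
print(datediff_months(order_date, ship_date))

today = date.today()     # similar to TODAY ()
now = datetime.now()     # similar to NOW ()
```

Counting boundaries rather than elapsed time is why two dates one day apart can still be "one month" apart.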

Using Text Functions 

DAX includes several text functions for working with text data. Some common ones are: 

  • LEFT: Returns the specified number of characters from the start of a string. 
  • RIGHT: Returns the specified number of characters from the end of a string. 
  • MID: Returns a substring, given a starting position and a length. 
  • LEN: Returns the number of characters in a string. 
  • TRIM: Removes all spaces from a string except single spaces between words. 
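For readers who know Python better than DAX, these text functions map closely onto string slicing. The helpers below are illustrative analogues, not DAX; the product code used in the example is invented. Note that MID counts positions from 1, whereas Python slices count from 0.

```python
def left(text, n):
    """First n characters, similar to LEFT ( text, n )."""
    return text[:n]


def right(text, n):
    """Last n characters, similar to RIGHT ( text, n )."""
    return text[-n:] if n > 0 else ""


def mid(text, start, n):
    """n characters from 1-based position start, similar to MID."""
    return text[start - 1:start - 1 + n]


def trim(text):
    """Drop leading/trailing spaces and collapse inner runs of spaces."""
    return " ".join(part for part in text.split(" ") if part)


code = trim("  AB-1234   XL ")   # hypothetical product code with stray spaces
print(code)
print(left(code, 2), mid(code, 4, 4), right(code, 2), len(code))
```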

Using CALCULATE & FILTER Functions 

DAX includes functions that change or restrict the data a calculation sees, which is what makes dynamic calculations and filters possible. The two most common are: 

  • CALCULATE: Evaluates an expression in a filter context modified by the filters you pass to it. 
  • FILTER: Returns a table containing only the rows of another table that meet a condition. 
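The following plain-Python sketch imitates the idea behind CALCULATE and FILTER: a base measure is re-evaluated over a narrowed set of rows. It is a simplification that ignores DAX details such as context transition and the way CALCULATE filters replace existing filters on the same column. The `sales` rows and helper names are hypothetical, and the DAX in the comments assumes a `Sales` table with `Region` and `Amount` columns.

```python
sales = [
    {"region": "East", "amount": 120.0},
    {"region": "West", "amount": 80.0},
    {"region": "East", "amount": 200.0},
]


def filter_rows(rows, predicate):
    """Like FILTER: keep only the rows for which predicate is true."""
    return [r for r in rows if predicate(r)]


def total_amount(rows):
    """Base measure, roughly: Total Sales = SUM ( Sales[Amount] )"""
    return sum(r["amount"] for r in rows)


def calculate(measure, rows, *predicates):
    """Like CALCULATE: evaluate measure after applying extra filters."""
    for predicate in predicates:
        rows = filter_rows(rows, predicate)
    return measure(rows)


# Roughly: CALCULATE ( [Total Sales], Sales[Region] = "East" )
east_sales = calculate(total_amount, sales, lambda r: r["region"] == "East")

# Roughly: CALCULATE ( [Total Sales],
#     FILTER ( Sales, Sales[Region] = "East" && Sales[Amount] > 150 ) )
big_east_sales = calculate(
    total_amount, sales, lambda r: r["region"] == "East" and r["amount"] > 150
)

print(east_sales, big_east_sales)
```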

Summing up Data Analysis Expressions (DAX) 

Data Analysis Expressions is a powerful language that can be used to perform advanced calculations and queries on data in Analysis Services, Power BI, and Power Pivot in Excel. By understanding the basics of DAX, you can create dynamic and informative reports that can help you make better decisions. 

July 21, 2023

Many people who run online businesses find the concept of big data unclear. They know it exists and have been told it can help, but they do not know how to make it relevant to their company’s operations. 

Starting with small amounts of data is the most effective way to begin a big data revolution. Every organization, regardless of size, needs meaningful data and insights.

Big data plays a crucial role in understanding your target audience and your customers’ preferences, and it can even help you predict their needs. To get there, the right data has to be presented clearly and assessed thoroughly. With it, a business can accomplish a variety of objectives. 




Nowadays, you can choose from a plethora of big data companies. Which firm is right for you, however, depends heavily on your requirements.

Big data companies in the USA not only provide corporations with frameworks, computing facilities, and pre-packaged tools, but also help businesses scale with cloud-based big data solutions. They help organizations define their big data strategy and offer consulting on how to improve company performance by unlocking the potential of their data. 

The big data revolution has the potential to open up many new opportunities for business expansion. The considerations below can help you choose the right partner. 


Competence in certain areas

You may be a start-up with an idea or an established company with a defined solution roadmap. Either way, your primary focus should be finding the right firm to turn your concept or proof of concept (POC) into reality. The expertise of its data engineers and their technological background should be top priorities when selecting a firm. 

Development team 

Getting your development team and the big data service provider on the same page is one of the keys to a successful partnership. The provider’s people have to be imaginative and forward-thinking, able to understand your requirements and to suggest even better options.

You may be able to assemble the most talented group of people, but the collaboration won’t bear fruit until everyone on the team shares your perspective on the project. After you have determined that the team members’ hard skills meet your criteria, examine their soft skills as well. 


Cost and placement considerations 

The geographical location of the provider and the total cost of the project are two other factors that can affect the software development process. For instance, you may decide to go with in-house development, but keep in mind that it is almost always more expensive.

Rather than getting a complete team, you may wind up with only two or three engineers who fit within your budget. But why pay extra for a lower-quality result? When outsourcing your development team, choose a country in a time zone that is convenient for you. 


In today’s business world, feedback is the most important factor in determining which organizations come out on top. Find out what other people think about the firm you want to work with so that you can avoid unpleasant surprises. Online reviews and ratings will be of great help in reaching a decision.


What role does big data play in businesses across different industries?

Among the most prominent sectors now using big data solutions are the retail and financial sectors, followed by e-commerce, manufacturing, and telecommunications. When it comes to streamlining their operations and better managing their data flow, business owners are increasingly investing in big data solutions. Big data solutions are becoming more popular among vendors as a means of improving supply chain management. 

  • In the financial industry, it can be used to detect fraud, manage risk, and identify new market opportunities.
  • In the retail industry, it can be used to analyze consumer behavior and preferences, leading to more targeted marketing strategies and improved customer experiences.
  • In the manufacturing industry, it can be used to optimize supply chain management and improve operational efficiency.
  • In the energy industry, it can be used to monitor and manage power grids, leading to more reliable and efficient energy distribution.
  • In the transportation industry, it can be used to optimize routes, reduce congestion, and improve safety.

Bottom line to the big data revolution

Big data, which refers to extensive volumes of historical data, makes it easier to identify important patterns and reach sounder judgments. It is already changing both how we market and how we operate. Governments, businesses, research institutions, IT subcontractors, and teams are using big data analytics to dig more deeply into mountains of data and, as a result, come to more informed conclusions.

May 8, 2023

The COVID-19 pandemic threw businesses into uncharted waters. Suddenly, digital transformation was more important than ever, and companies had to pivot quickly or risk extinction. And the humble QR code – once dismissed as a relic of the past – became an unlikely hero in this story. 

QR tech’s versatility and convenience allowed businesses, both large and small, to stay afloat amid challenging circumstances and even inspired some impressive growth along the way. But the real magic happened when data analytics was added to the mix. 


You see, when QR codes were paired with data analytics, companies could see the impact of their actions in real time. They were able to track customer engagement, spot trends, and get precious new insights into their customers’ preferences. This newfound knowledge enabled companies to create superior strategies, refine their campaigns, and more accurately target their audience.  

The result? Faster growth that’s both measurable and sustainable. Read on to find out how you, too, can use data analytics and QR codes to supercharge your business growth. 

Why use QR codes to track data? 

Did you ever put in a lot of effort and time to craft the perfect marketing campaign only to be left wondering how effective it was? How many people viewed it, how many responded, and what was the return on investment?  

Previously, tracking the MROI (Marketing Return on Investment) of offline campaigns was an inconvenient and time-consuming process. Businesses relied on coupon codes, traditional media, or surveys to measure campaign success.

For example, say you put up a billboard ad. Without coupon codes or asking people how they found out about you, it was almost impossible to know if someone had even seen the ad, let alone acted on it. But the game changed when trackable QR codes came in.

Adding these nifty pieces of technology to your offline campaigns allows you to collect valuable data and track customer behavior. All the customers have to do is scan your code, which will take them to a webpage or a landing page of your choosing. In the process, you’ll capture not only first-party data from your audience but also valuable insights into the success of your campaigns. 

For instance, if you have installed the same billboard campaign in two different locations, a QR code analytics dashboard can help you compare the results to determine which one is more effective. Say 2000 people scanned the code in location A, while only 500 scanned it in location B. That’s valuable intel you can use to adjust your strategy and ensure all your offline campaigns perform at their best. 
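The billboard comparison above boils down to counting scans per location. Here is a minimal Python sketch using the figures from the example; the `scan_log` list is a stand-in for whatever export your QR analytics dashboard provides, which is an assumption on our part.

```python
from collections import Counter

# Hypothetical scan log: one entry per scan, labelled with the billboard location.
scan_log = ["A"] * 2000 + ["B"] * 500

scans = Counter(scan_log)
total = sum(scans.values())

# Share of all scans contributed by each location, largest first.
for location, n in scans.most_common():
    print(f"Location {location}: {n} scans ({n / total:.0%} of total)")

# How many times more scans location A drew than location B.
ratio = scans["A"] / scans["B"]
print(f"Location A drew {ratio:.1f}x the scans of location B")
```

A raw count like this says nothing about foot traffic at each site, so it is worth normalizing by estimated impressions if you have them.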

How does data analytics fit in the picture? 

Once you’ve employed QR codes and started tracking your campaigns, it’s time to play your trump card – analytics. 

Extracting wisdom from your data is what turns your campaigns from good to great. Analytics tools can help you dig deep into the numbers, find correlations and uncover insights to help you optimize your campaigns and boost conversions. 

For example, using trackable codes, you can find out the number of scans. But adding analytics tools to the mix can reveal how long users interacted with the content after scanning your code, what locations yielded the most scans, and more.
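As a sketch of that extra layer, the Python below summarizes scans and average time on page per location. The event records, field names, and locations are invented for illustration; a real analytics export will have its own schema.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical scan events joined with on-page engagement time.
scan_events = [
    {"location": "Downtown", "seconds_on_page": 42},
    {"location": "Downtown", "seconds_on_page": 18},
    {"location": "Airport", "seconds_on_page": 95},
    {"location": "Airport", "seconds_on_page": 65},
    {"location": "Airport", "seconds_on_page": 80},
]

# Group engagement times by the location where the code was scanned.
durations = defaultdict(list)
for event in scan_events:
    durations[event["location"]].append(event["seconds_on_page"])

# Scan count and average engagement per location.
summary = {
    location: {"scans": len(times), "avg_seconds": mean(times)}
    for location, times in durations.items()
}

top_location = max(summary, key=lambda loc: summary[loc]["scans"])
print(summary)
print("Most scans:", top_location)
```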

This transforms your data from merely informative to actionable. And arming yourself with these kinds of powerful insights will go a long way in helping you make smarter decisions and accelerate your growth. 

Getting started with QR code analytics 

Ready to start leveraging the power of QR codes and analytics? Here’s a step-by-step guide to getting started: 

Step 1: Evaluate QR codes’ suitability for your strategy 

Before you begin, ask yourself if a QR code project is actually in line with your current resource capacity and target audience. If you’re trying to target a tech-savvy group of millennials who lead busy lives, QR codes could be the perfect solution. But they may not be the best choice if you’re aiming for an older demographic who may struggle with technology.  

Plus, keep in mind that you’ll also need dedicated resources to continually track and manage your project and the data it’ll yield. As such, make certain you have the right resource support lined up before diving in. 

Step 2: Get yourself a solid QR code generator 

The next step is to find a reliable and feature-rich QR code generator. A good one should allow you to customize your codes, track scans, and easily integrate with your other analytics tools. The internet is full of such QR code generators, so do your research, read reviews, and pick the best one that meets your needs. 

Step 3: Choose your QR code type 

QR codes come in two major types:  

  1. Static QR codes – The most basic type of code. They point to a single, predefined destination URL and don’t allow for any data tracking.  
  2. Dynamic/trackable QR codes – These are the codes we’ve been talking about. They are far more sophisticated, as they allow you to track and measure scans, collect vital data points, and even change the destination URL on the fly if needed.

For the purpose of analytics, you will have to opt for dynamic/trackable QR codes. 
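One way to picture why dynamic codes can be tracked and retargeted: the printed code encodes a fixed short URL, and a service behind that URL logs each scan before redirecting to the current destination. The toy Python class below is only a conceptual sketch under that assumption; the class, its methods, and the example.com URLs are hypothetical and do not describe any particular QR code generator.

```python
from datetime import datetime, timezone


class DynamicQRCode:
    """Toy model of a dynamic QR code backed by a redirect service."""

    def __init__(self, short_url, destination):
        self.short_url = short_url        # fixed URL encoded in the printed code
        self.destination = destination    # where scans are currently sent
        self.scan_log = []                # one record per scan

    def scan(self, device="unknown"):
        """Record the scan, then return the URL the user is redirected to."""
        self.scan_log.append(
            {"time": datetime.now(timezone.utc), "device": device}
        )
        return self.destination

    def retarget(self, new_destination):
        """Change the destination without reprinting the code."""
        self.destination = new_destination


code = DynamicQRCode("https://example.com/q/abc", "https://example.com/spring-sale")
first = code.scan(device="ios")
code.retarget("https://example.com/summer-sale")
second = code.scan(device="android")
print(first, second, len(code.scan_log))
```

A static code, by contrast, encodes the destination itself, so there is nothing in the middle to log scans or swap the URL.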

Step 4: Design and generate your QR code

Now that you have your QR code generator and type sorted, you can start with the QR code creation process. Depending on the generator you picked, this can take a few clicks or involve a bit of coding.

But be sure to dress up your QR codes with your brand colors and an enticing call to action to encourage scans. A visually appealing code will be far more likely to pique people’s interest and encourage them to take action than a dull, black-and-white one. 

Step 5: Download and print out the QR code 

Once you have your code ready, save it and print it out. But before printing a big batch of copies to use in your campaigns, test your code to ensure it works as expected. Scan it from different devices and check the destination URL to verify everything is good before moving ahead with your campaign. 

Step 6: Start analyzing the data 

Most good QR code generators come with built-in analytics or allow you to integrate with popular tools like Google Analytics. So you can either go with the integrated analytics or hook up your code with your analytics tool of choice. 

Industry use cases using QR codes and analytics 

QR codes, when combined with analytics tools, can be incredibly powerful in driving business growth. Let’s look at some use cases that demonstrate the potential of this dynamic duo. 

1. Real estate – Real estate agents can use QR codes to give potential buyers a virtual tour of their properties. This tech can also be used to provide comprehensive information about the property, like floor plans and features. Furthermore, with analytics integration, real estate agents can track how many people access property information and view demographic data to better understand each property’s target market.  

2. Coaching/ Mentorship – A coaching business can use QR codes to target potential clients and measure the effectiveness of their coaching materials. For example, coaches could test different versions of their materials and track how many people scanned each QR code to determine which version resonated best with their target audience. Statistics derived from this method will let them refine their materials, hike up engagement and create a higher-end curriculum. 

3. Retail – QR codes are an excellent way for retailers to engage customers in their stores and get detailed metrics on their shopping behavior. Retailers can create links to product pages, add loyalty programs and coupons, or offer discounts on future purchases. All these activities can be tracked using analytics, so retailers can understand customer preferences and tailor their promotions accordingly. 
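The version-testing idea from the coaching example can be sketched in a few lines of Python: give each version its own QR code, then compare scan rates rather than raw scan counts. All numbers and names below are made up for illustration.

```python
# Hypothetical results: each version of the material carries its own QR code.
variants = {
    "Version A": {"copies_distributed": 500, "scans": 60},
    "Version B": {"copies_distributed": 400, "scans": 68},
}


def scan_rate(stats):
    """Scans per copy distributed, so unequal print runs compare fairly."""
    return stats["scans"] / stats["copies_distributed"]


rates = {name: scan_rate(stats) for name, stats in variants.items()}
best = max(rates, key=rates.get)

for name, rate in rates.items():
    print(f"{name}: {rate:.1%} scan rate")
print("Higher scan rate:", best)
```

With small samples the gap between versions may be noise, so treat a result like this as a prompt for a larger test rather than a verdict.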

QR codes and data analytics: A dynamic partnership

No longer confined to the sidelines, QR codes have been propelled to the forefront of modern marketing and technology by their newfound usage. By combining QR codes with analytics tools, you can unlock new opportunities to streamline processes, engage customers, and drive your business further. This powerful partnership is one of the best ways to move your company forward digitally. 

March 22, 2023

Data analytics is the driving force behind innovation, and staying ahead of the curve has never been more critical. That is why we have scoured the landscape to bring you the crème de la crème of data analytics conferences in 2023.  

Data analytics conferences provide an essential platform for professionals and enthusiasts to stay current on the latest developments and trends in the field. By attending, they can gain new insights and enhance their data analytics skills.

These events bring together experts, practitioners, and thought leaders from various industries and backgrounds to share their experiences and best practices. Such conferences also provide an opportunity to network with peers and make new connections.  

Data analytics conferences to look forward to

In 2023, there will be several conferences dedicated to this field, where experts from around the world will come together to share their knowledge and insights. In this blog, we will dive into the top data analytics conferences of 2023 that data professionals and enthusiasts should add to their calendars.

Top Data Analytics Conferences in 2023 – Data Science Dojo

Strata Data Conference   

The Strata Data Conference is one of the largest and most comprehensive data conferences in the world. It is organized by O’Reilly Media and will take place in San Francisco, CA in 2023. It is a leading event in data analytics and technology, focusing on data and AI to drive business value and innovation. The conference brings together professionals from various industries, including finance, healthcare, retail, and technology, to discuss the latest trends, challenges, and solutions in the field of data analytics.   

It will bring together leading data scientists, engineers, and executives from across the world, and will cover a wide range of topics, including artificial intelligence, machine learning, big data, and cloud computing. 

Big Data & Analytics Innovation Summit  

The Big Data & Analytics Innovation Summit is a premier conference that brings together experts from various industries to discuss the latest trends, challenges, and solutions in data analytics. The conference will take place in London, England in 2023 and will feature keynotes, panel discussions, and hands-on workshops focused on topics such as machine learning, artificial intelligence, data management, and more.  

Attendees can attend keynote speeches, technical sessions, and interactive workshops, where they can learn about the latest technologies and techniques for collecting, processing, and analyzing big data to drive business outcomes and make informed decisions. The summit’s relevance to data analytics lies in its focus on big data and its impact on businesses and industries. 

Predictive Analytics World   

Predictive Analytics World is among the leading data analytics conferences that focus specifically on the applications of predictive analytics. It will take place in Las Vegas, NV in 2023. Attendees will learn about the latest trends, technologies, and solutions in predictive analytics and gain valuable insights into this field’s future.  

At PAW, attendees can learn about the latest advances in predictive analytics, including techniques for data collection, data preprocessing, model selection, and model evaluation. For the uninitiated, predictive analytics is a branch of data analytics that uses historical data, statistical algorithms, and machine learning techniques to make predictions about future events. 
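To give a flavor of what "using historical data to make predictions" means, here is a deliberately tiny Python sketch that fits a straight line to past values with ordinary least squares and extrapolates one step ahead. The monthly revenue figures are invented, and real predictive models are usually far richer than a single trend line.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: revenue for months 1 to 5.
months = [1, 2, 3, 4, 5]
revenue = [100.0, 110.0, 120.0, 130.0, 140.0]

slope, intercept = fit_line(months, revenue)

# Predict month 6 by extending the fitted trend.
forecast = slope * 6 + intercept
print(f"Trend: {slope:.1f} per month, forecast for month 6: {forecast:.1f}")
```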

AI World Conference & Expo   

The AI World Conference & Expo is a leading conference focused on artificial intelligence and its applications in various industries. The conference will take place in Boston, MA in 2023 and will feature keynote speeches, panel discussions, and hands-on workshops from leading AI experts, business leaders, and data scientists. Attendees will learn about the latest trends, technologies, and solutions in AI and gain valuable insights into this field’s future.  

The AI World Conference & Expo is relevant to data analytics because of its focus on the role of AI and data in driving business value and innovation. The event offers attendees an opportunity to learn from leading experts in the field, connect with other professionals, and stay informed about the most recent developments in AI and data analytics. 

Data Science Summit   

Last on the data analytics conference list is the Data Science Summit, a premier conference focused on data science applications in various industries. The conference will take place in San Diego, CA in 2023 and will feature keynote speeches, panel discussions, and hands-on workshops from leading data scientists, business leaders, and industry experts. Attendees will learn about the latest trends, technologies, and solutions in data science and gain valuable insights into this field’s future.  

Special mention – Future of Data and AI

Hosted by Data Science Dojo, Future of Data and AI is an unparalleled opportunity to connect with top industry leaders and stay at the forefront of the latest advancements. Featuring 20+ industry experts, the two-day virtual conference offers a diverse range of expert-level knowledge and training opportunities.

Don’t worry if you missed out on the Future of Data and AI Conference! You can still catch all the amazing insights and knowledge from industry experts by watching the conference on YouTube.

Bottom line

In conclusion, the world of data analytics is constantly evolving, and it is crucial for professionals to stay updated on the latest trends and developments in the field. Attending conferences is one of the most effective ways to stay ahead of the game and enhance your knowledge and skills.  

The 2023 data analytics conferences listed in this blog are some of the most highly regarded events in the industry, bringing together experts and practitioners from all over the world. Whether you are a seasoned data analyst, a new entrant in the field, or simply looking to expand your network, these conferences offer a wealth of opportunities to learn, network, and grow.

So, start planning and get ready to attend one of these top conferences in 2023 to stay ahead of the curve. 


March 2, 2023

Have you ever heard a story told with numbers? That’s the magic of data storytelling, and it’s taking the world by storm. If you’re ready to captivate your audience with compelling data narratives, you’ve come to the right place.

What is data storytelling – Detailed analysis by Data Science Dojo


Everyone loves data. It’s the reason your organization is able to make informed decisions on a regular basis. With new tools and technologies becoming available every day, it’s easy for businesses to access the data they need rather than search for it. Unfortunately, this also means that more and more people have to grapple with the ins and outs of presenting data in an understandable way.

The rise of social media has allowed people to share their experiences with a product or service without having to look it up first. As a result, businesses are being forced to present data in a more refined way than ever before if they want to retain customers, generate leads, and build brand loyalty. 

What is data storytelling? 

Data storytelling is the process of using data to communicate the story behind the numbers, and it is becoming more relevant as more people learn how to use data to make decisions. A good data story allows a business to dive deeper into the numbers and explore the context that led to them.

For example, let’s say you’re running a health and wellness clinic. A patient walks into your clinic, and you diagnose that they have low energy, are stressed out, and have an overall feeling of being unwell. Based on this, you recommend a course of treatment that addresses the symptoms of stress and low energy. This data story could then be used to inform the next steps that you recommend for the patient.   

Why is data storytelling important in three main fields: Finance, healthcare, and education? 

Finance – With online banking and payment systems becoming more common, the demand for data storytelling is greater than ever. Data can be used to improve a customer journey, improve the way your organization interacts with customers, and provide personalized services. 

Healthcare – With medical information becoming increasingly complex, data storytelling is more important than ever. 

Education – With more and more schools turning to data to provide personalized education, data storytelling can help drive outcomes for students. 


The importance of authenticity in data storytelling 

Authenticity is key when it comes to data storytelling. To see why, compare two data stories. In the first, you present the data in a way that is true to the numbers, even if some context is lost in translation. In the second, you present the data in a simplified way that reflects the situation but leaves out key details. This is the key difference between data storytelling that is authentic and data storytelling that is not.

As you can imagine, the second, inauthentic data story will be much less impactful than the first. It may help someone, but it likely won’t have the same positive effect. The key to authenticity is to be true to the facts and honest with your readers. You want to tell a story that reflects the data and is also true to the context of the data. 


Register for our conference, Future of Data and AI, to learn from esteemed leaders and discover how to put data storytelling into action. Don’t miss out!


How to put data storytelling into action

Start by gathering all the relevant data together. This could include figures from products, services, and your business as a whole; it could also include data about how your customers are currently using your product or service. Once you have your data together, you’ll want to begin to create a content outline.

This outline should be broken down into paragraphs and sentences that will help you tell your story more clearly. Invest time into creating an outline that is thorough but also easy for others to follow.

Next, you’ll want to begin to find visual representations of your data. This could be images, infographics, charts, or graphs. The visuals you choose should help you to tell your story more clearly.

Once you’ve finished your visual content, you’ll want to polish off your data stories. The last step in data storytelling is to write your stories and descriptions. This will give you an opportunity to add more detail to your visual content and polish off your message. 


The need for strategizing before you start 

While the process of data storytelling is fairly straightforward, the best way to begin is by strategizing. This step helps you create a content outline that is thorough, complete, and engaging. Start by thinking about who you are writing your stories for, whether that is a specific segment of your audience or a wider one. Once you’ve identified your audience, think about what you want to achieve, which keeps the outline targeted and specific.

Next, think about what your content outline will look like and what it will include. This helps you make sure the outline is detailed, engaging, and complete. 

Planning your content outline 

There are a few key things that you’ll want to include in your content outline. These include audience pain points, a detailed overview of your content, and your strategy. With your strategy, you’ll want to think about how you plan to present your data. This will help you to create a content outline that is focused, and it will also help you to make sure that you stay on track. 



Researching your audience and understanding their pain points 

With the planning complete, start researching your audience. This will sharpen the focus of your content outline and help you understand your audience’s pain points. With those pain points in mind, you can make the outline more detailed, engaging, and honest, and check that it includes everything you want to cover.   

Next, research the pain points themselves so that the outline addresses them in detail. 

With your audience in mind, think about how to present your information so that readers can connect with it. 

The final step in creating your content outline is to decide where you’re going to publish your data stories. If you’re going to publish on a website, think about the layout you want to use, the amount of text, and the number of images you want to include. 


The need for strategizing before you start 

Just as a good story always has a beginning, a middle, and an end, so does a good data story. The best way to start is by gathering all the relevant data together and creating a content outline. Once you’ve done this, you can begin to strategize and make your content more engaging, and you’ll want to make sure that you stay on track. 


Mastering your message: How to create a winning content outline

The first thing that you’ll want to think about when it comes to planning your content outline is your strategy. This will help you to make sure that you stay on track with your content outline. Next, you’ll want to think about your audience’s pain points. This will help you to make sure that you stay focused on the most important aspects of your content.  


Researching your audience and understanding their pain points 

The final thing that you’ll want to do before you begin to create your content outline is to research your audience. This will help you to make sure that you stay focused on the most important aspects of your content. With pain points in mind, you’ll want to make sure that you stay focused on the most important aspects of your content.  

Next, you’ll want to start to research your audience. This will help you to make sure that you stay focused on the most important aspects of your content. 

By approaching data storytelling in this way, you should be able to create engaging, detailed, and targeted content. 


The bottom line: What we’ve learned

In conclusion, data storytelling is a powerful tool that allows businesses to communicate complex data in a simple, engaging, and impactful way. It can help to inform and persuade customers, generate leads, and drive outcomes for students. Authenticity is a key component of effective data storytelling, and it’s important to be true to the facts while also being honest with your readers.

With careful planning and a thorough content outline, anyone can create powerful and effective data stories that engage and inspire their audience. As data continues to play an increasingly important role in decision-making across a wide range of industries, mastering the art of data storytelling is an essential skill for businesses and individuals alike.

February 21, 2023

In this blog, we will discuss what Data Analytics RFP is and the five steps involved in the data analytics RFP process.


December 1, 2022

In this article, we’re going to talk about how data analytics can help your business generate more leads and why you should rely on data when making decisions regarding a digital marketing strategy. 

Some people believe that marketing is about creativity – unique and interesting campaigns, quirky content, and beautiful imagery. Contrary to their beliefs, data analytics is what actually powers marketing – creativity is simply a way to accomplish the goals determined by analytics. 

Now, if you’re still not sure how you can use data analytics to generate more leads, here are our top 10 suggestions. 

1. Know how your audience behaves

Most businesses have an idea or two about who their target audience is. But having an idea or two is not good enough if you want to grow your business significantly – you need to be absolutely sure who your audience is and how they behave when they come to your website. 

Now, the best way to do that is to analyze the website data.  

You can tell quite a lot by simply looking at the right numbers. For instance, if you want to know whether the users can easily find the information they’re looking for, keep track of how much time they spend on a certain webpage. If they leave the webpage as soon as it loads, they probably didn’t find what they needed. 
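As a minimal sketch of this check, the snippet below computes the average time on page and the share of very short visits for each URL. The page-view records, the function name, and the 5-second threshold are hypothetical and chosen only for illustration; a real analytics export will look different.

```python
from collections import defaultdict

# Hypothetical page-view records: (page URL, seconds spent on the page)
page_views = [
    ("/pricing", 2), ("/pricing", 3), ("/pricing", 45),
    ("/blog/dax-basics", 120), ("/blog/dax-basics", 95), ("/blog/dax-basics", 4),
]

QUICK_EXIT_SECONDS = 5  # assumed threshold for "left as soon as it loaded"


def summarize_time_on_page(views, threshold=QUICK_EXIT_SECONDS):
    """Return {page: (average seconds, share of visits shorter than threshold)}."""
    durations = defaultdict(list)
    for page, seconds in views:
        durations[page].append(seconds)
    summary = {}
    for page, values in durations.items():
        average = sum(values) / len(values)
        quick_share = sum(1 for v in values if v < threshold) / len(values)
        summary[page] = (average, quick_share)
    return summary


summary = summarize_time_on_page(page_views)
for page, (average, quick_share) in summary.items():
    print(f"{page}: avg {average:.1f}s, quick exits {quick_share:.0%}")
```

A page with a high share of quick exits is a candidate for a closer look at its content or layout.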

We know that looking at spreadsheets is a bit boring, but you can easily obtain Power BI Certification and use Microsoft Power BI to make data visuals that are easy to understand and pleasing to the eye. 





Data analytics books
Books on Data Analytics – Compilation by Data Science Dojo

Read the top 12 data analytics books to learn more about it


2. Segment your audience

A great way to satisfy the needs of different subgroups within your target audience is to use audience segmentation. Using that, you can create multiple funnels for the users to move through instead of just one, thereby increasing your lead generation. 

Now, before you segment your audience, you need to have enough information about these subgroups so that you can divide them and identify their needs. Since you can’t individually interview users and ask them for the necessary information, you can use data analytics instead. 

Once you have that, it’s time to identify their pain points and address them differently for different subgroups, and voilà – you’ve got yourself more leads. 

3. Use data analytics to improve buyer persona

Knowing your target audience is a must but identifying a buyer persona will take things to the next level. A buyer persona doesn’t only contain basic information about your customers. It goes deeper than that and tells you their exact age, gender, hobbies, location, and interests.  

It’s like describing a specific person instead of a group of people. 

Of course, not all your customers will fit that description to a T, but that’s not the point. The point is to have that one idea of a person (or maybe two or three buyer personas) in your mind when creating content for your business.  

buyer persona - Data analytics
Understanding buyer persona with the help of Data analytics  [Source: Freepik] 


4. Use predictive marketing 

While data analytics should absolutely be used in retrospectives, there’s another purpose for the information you obtain through analytics – predictive marketing. 

Predictive marketing is basically using big data to develop accurate forecasts of customers’ behavior. It uses complex machine-learning algorithms to build predictive models. 

A good example of how that works is Amazon’s landing page, which includes personalized recommendations.  

Amazon doesn’t only keep track of the user’s previous purchases, but also what they have clicked on in the past and the types of items they’ve shown interest in. By combining that with the season and time of purchase, they are able to make highly relevant recommendations. 

lead generation
Acquiring customers – Lead generation


If you’re curious to find out how data science works, we suggest that you enroll in the Data Science Bootcamp


5. Know where website traffic comes from 

Users come to your website from different places.  

Some have searched for it directly on Google, some have run into an interesting blog piece on your website, while others have seen your ad on Instagram. This means that the time and effort you put into optimizing your website and creating interesting content pays off. 

But imagine creating a YouTube ad that doesn’t bring much traffic – that doesn’t pay off at all. You’d then want to rework your campaign or redirect your efforts elsewhere.  

This is exactly why knowing where website traffic comes from is valuable. You don’t want to invest your time and money into something that doesn’t bring you any benefits. 

6. Understand which products work 

Most of the time, you can determine what your target audience will like and dislike. The more information you have about your target audience, the better you can satisfy their needs.  

But no one is perfect, and anyone can make a mistake. 

Heinz, a company known for producing ketchup and other food, once released a new product: EZ Squirt ketchup in shades of purple, green, and blue. At first, the kids loved it, but this didn’t last for long. Six years later, Heinz halted production of these products. 

As you can see, even big and experienced companies flop sometimes. A good way to avoid that is by tracking which product pages have the least traffic and don’t sell well. 

7. Perform competitor analysis 

Keeping an eye on your competitors is never a bad idea. No matter how well you’re doing and how unique you are, others will try to surpass you and become better. 

The good news is that there are quite a few tools online that you can use for competitor analysis. SEMrush, for instance, can help you see what the competition is doing to get qualified leads so that you can use it to your advantage. 

Even if the tool you need doesn’t exist, you can always enroll in a Python for Data Science course and learn to build your own tools that can track the data you need to drive your lead generation. 

competitor analysis - data analytics
Performing competitor analysis through data analytics [Source: Freepik] 

8. Nurture your leads

Nurturing your leads means developing a personalized relationship with your prospects at every stage of the sales funnel in order to get them to buy your products and become your customers. 

Because lead nurturing offers a personalized approach, you’ll need information about your leads: what is their title, role, industry, and similar info, depending on what your business does. Once you have that, you can provide them with the relevant content that will help them decide to buy your products and build brand loyalty along the way. 

This is something b2b lead generation companies can help you with if you’re hesitant to do it on your own.  

9. Gain more customers

Having an insight into your conversion rate, churn rate, sources of website traffic, and other relevant data will ultimately lead to more customers. For instance, your sales team will be able to calculate which sources convert most effectively and prepare resources before running a campaign. 
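A minimal sketch of that calculation: the snippet below ranks traffic sources by conversion rate. The source names and numbers are made up for illustration, and `conversion_rates` is a hypothetical helper, not part of any analytics product.

```python
# Hypothetical visits and conversions per traffic source
traffic = {
    "organic_search": {"visits": 4000, "conversions": 120},
    "instagram_ads": {"visits": 1500, "conversions": 30},
    "youtube_ads": {"visits": 2500, "conversions": 10},
}


def conversion_rates(sources):
    """Return (source, conversion rate) pairs sorted from best to worst."""
    rates = {
        name: stats["conversions"] / stats["visits"]
        for name, stats in sources.items()
        if stats["visits"] > 0  # skip sources without traffic
    }
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


ranked = conversion_rates(traffic)
for name, rate in ranked:
    print(f"{name}: {rate:.2%}")
```

Sorting by rate rather than by raw conversions keeps a small but efficient source from being hidden behind a large but inefficient one.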

The more information you have, the better you’ll perform, and this is exactly why Data Science for Business is important – you’ll be able to see the bigger picture and make better decisions. 

Data analysts performing data analysis of customer’s data

10. Avoid significant losses 

Finally, data can help you avoid certain losses by halting the launch of a product that won’t do well. 

For instance, you can use the Coming soon page to research the market and see if your customers are interested in a new product you planned on launching. If enough people show interest, you can start producing, and if not – you won’t waste your money on something that was bound to fail. 



Applications of data analytics go beyond simple data analysis, especially for advanced analytics projects. The majority of the labour is done up front in the data collection, integration, and preparation stages, followed by the creation, testing, and revision of analytical models to make sure they give reliable findings. Data engineers, who build data pipelines and aid in the preparation of data sets for analysis, are frequently included within analytics teams in addition to data scientists and other data analysts.

November 17, 2022

Data Science Dojo is offering Metabase for FREE on Azure Marketplace packaged with web accessible Metabase: Open-Source server. 

Metabase query



Organizations often adopt strategies that enhance the productivity of their selling points. One strategy is to use prior business data to identify key patterns for a product and then make decisions accordingly. However, this work is laborious, costly, and requires domain experts. Metabase bridges that skills gap. It provides marketing and business professionals with an easy-to-use query builder notebook to extract the required data and visualize it with just a few clicks, without any SQL coding. 

What is Metabase, and what is a question? 

Metabase is an open-source business intelligence framework that provides a web interface to import data from diverse databases and then analyze and visualize it with a few clicks. The methodology of Metabase is based on questions and the answers to them. They form the foundation of everything else that it provides. 


A question is any kind of query that you want to perform on your data. Once you have specified the query functions in the notebook editor, you can visualize the query results. After that, you can save the question for reuse and turn it into a data model for business-specific purposes. 

Pro Tip: Join our 6-month instructor-led Data Science Bootcamp to become an expert in data science and analytics 

Challenges for businesses  

For businesses that lack expert analysts, engineers, and a substantial IT department, it was costly and time-consuming either to hire new domain experts or to have managers learn to code and then explore and visualize data themselves. Apart from that, few existing applications offer connections to diverse data sources, which was also a challenge. 

In this regard, a straightforward interactive tool that even newbies could adapt immediately and thus get the job done would be the most ideal solution. 

Data analytics with Metabase  

Metabase’s concept is based on questions, which are essentially queries, and data models (special saved questions). It provides an easy-to-use notebook through which users can gather raw data, filter it, join tables, summarize information, and add other customizations without any need for SQL coding.

Users can select the dimensions of columns from tables and then create various visualizations and embed them in different sub-dashboards. Metabase is frequently utilized for pitching business proposals to executive decision-makers because the visualizations are very simple to achieve from raw data. 


visualization on sample data
Figure 1: A visualization on sample data 


Figure 2:  Query builder notebook


Major characteristics 

  • Metabase delivers a notebook that lets users select data, join it with other tables, filter it, and perform other operations by clicking on options instead of writing a SQL query 
  • For complex queries, users can also use a built-in, optimized SQL editor 
  • The choice to select from various data sources like PostgreSQL, MongoDB, Spark SQL, Druid, etc., makes Metabase flexible and adaptable 
  • Under the Metabase admin dashboard, users can troubleshoot the logs regarding different tasks and jobs 
  • Has the ability to enable public sharing. It enables admins to create publicly viewable links for Questions and Dashboards  

What Data Science Dojo has for you  

Metabase instance packaged by Data Science Dojo serves as an open-source easy-to-use web interface for data analytics without the burden of installation. It contains numerous pre-designed visualization categories waiting for data.

It has a query builder which is used to create questions (customized queries) with a few clicks. In our service, users can also use an in-browser SQL editor for complex queries. Any user who wants to identify the impact of their product from raw business data can use this tool. 

Features included in this offer:  

  • A rich web interface running Metabase: Open Source 
  • A no-code query building notebook editor 
  • In-browser optimized SQL editor for complex queries 
  • Beautiful interactive visualizations 
  • Ability to create data models 
  • Email configuration and Slack support 
  • Shareability feature 
  • Easy specification for metrics and segments 
  • Feature to download query results in CSV, XLSX and JSON format 

Our instance supports the following major databases: 

  • Druid 
  • PostgreSQL 
  • MySQL 
  • SQL Server 
  • Amazon Redshift 
  • Big Query 
  • Snowflake 
  • Google Analytics 
  • H2 
  • MongoDB 
  • Presto 
  • Spark SQL 
  • SQLite 


Metabase is business intelligence software that is especially useful for marketing and product managers. By making it possible to share analytics with various teams within an enterprise, Metabase makes it simple for developers to create reports and collaborate on projects. The responsiveness and processing speed are faster than the traditional desktop environment as it uses Microsoft cloud services. 

At Data Science Dojo, we deliver data science education, consulting, and technical services to increase the power of data. We are therefore adding a free Metabase server dedicated specifically to data analytics operations on Azure Marketplace. Hurry up and install this offer by Data Science Dojo, your ideal companion in your journey to learn data science!  

Click on the button below to head over to the Azure Marketplace and deploy Metabase for FREE by clicking on “Get it now”. 

CTA - Try now

Note: You’ll have to sign up to Azure, for free, if you do not have an existing account. 

November 5, 2022

From customer relationship management to tracking analytics, marketing analytics tools are important in the modern world. Learn how to make the most of these tools.

What do you usually find in a toolbox? A hammer, screwdriver, nails, tape measure? If you’re building a bird house, these would be perfect for you, but what if you’re creating a marketing campaign? What tools do you want at your disposal? It’s okay if you can’t come up with any. We’re here to help.

Industry’s leading marketing analytics tools

These days marketing is all about data. Whether it’s a click on an email or an abandoned cart on Amazon, marketers are using data to better cater to the needs of the consumer. To analyze and use this data, marketers have a toolbox of their own.

So what are some of these tools and what do they offer? Here, at Data Science Dojo, we’ve come up with our top 5 marketing analytics tools for success:

Customer relationship management platform (CRM)

A CRM is a tool used for managing everything there is to know about the customer. It can track where and when a consumer visits your site, track the interactions on your site, and create profiles for leads. A few examples of CRMs are:

HubSpot logo

HubSpot, along with the two others listed above, took the idea of a CRM and made it into an all-inclusive marketing resort. Along with the traditional CRM uses, HubSpot can be used to:

  • Manage social media
  • Send mass email campaigns
  • View traffic, campaign, and customer analytics
  • Associate emails, blogs, and social media posts to specific marketing campaigns
  • Create workflows and sequences
  • Connect to your other analytics tools such as Google Analytics, Facebook Ads, YouTube, and Slack.

HubSpot continues its effectiveness by creating reports allowing its users to analyze what is and isn’t working.

This is just a brief description revealing the tip of the iceberg of what HubSpot does. If you want to see below the water line, visit its website.

Search software

Search engine optimization (SEO) is the process of improving how a website ranks on search engines. It’s how you can find everything you have ever searched for on Google. Search software helps marketers analyze how to best optimize websites for potential consumers to find.

A few search software companies are:

I would love to describe each one of the above businesses, but I only have experience with Moz. Moz focuses on a “less invasive way (of marketing) where customers are earned rather than bought”.

Its entire business is focused on upgrading your SEO. Moz offers 9 different services through its Moz Pro toolkit:

Moz Pro Services

I love Moz Keyword Explorer. This is the tool I use to check different variations of titles, keywords, phrases, and hashtags. It gives four different scores, which you can see in the photo below.

Moz Keyword Explorer

Now, there’s not enough data to show the average monthly volume for my name, but, according to Moz, it wouldn’t be that difficult to rank higher than my competitors, people have a high likelihood of clicking, and the Priority explains that my name is not a “sweet spot” for high volume, low difficulty, and high CTR. In conclusion, using my name as a keyword to optimize the Data Science Dojo Blog isn’t the best idea.

Read more about marketing analytics in this blog

Web analytics service

We can’t talk about marketing tools and not mention web analytics services. These are among the most important pieces of equipment in the marketer’s toolbox. Google Analytics (GA) is a free web analytics service that integrates your company’s website data into a meticulously organized dashboard. I wouldn’t say GA is the be-all and end-all piece of equipment, and there are many different services and tools out there. However, it can’t be refuted that Google Analytics is a great tool to integrate into your company’s marketing strategy.

Some similar Web Analytics Services include:

Google analytics logo
Google Analytics logo

Some of the analytics you’ll be able to understand are:

  • Real-time data – Who’s on your site right now? Where are the users coming from? What pages are they looking at?
  • Audience Information – Where do your users live, age range, interests, gender, new or returning visitor, etc.?
  • Acquisition – Where did they come from (Organic, Direct, Paid Ads, Referrals, Campaigns)? What day/time do they land on your website? What was the final URL they visited before leaving? You can also link to any Google Ads campaigns you have running.
  • Behavior – What is the path people take to convert? How is your site speed? What events took place (Contact form submission, newsletter signup, social media share)?
  • Conversions – Are you attributing conversions by first touch, last touch, linear, or decay?
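The four attribution models named in the last bullet can be sketched in a few lines. This is an illustrative implementation, not how Google Analytics computes attribution: the `attribute` function, the `half_life` parameter, and the sample journey are all assumptions made for the example.

```python
def attribute(touchpoints, model, half_life=2.0):
    """Split one conversion's credit across an ordered list of channels.

    touchpoints: channels in the order the user encountered them.
    model: "first", "last", "linear", or "decay".
    half_life: for "decay", the number of steps back from the conversion
               at which a touchpoint's weight halves (an assumed parameter).
    """
    n = len(touchpoints)
    if n == 0:
        return {}
    if model == "first":
        weights = [1.0] + [0.0] * (n - 1)
    elif model == "last":
        weights = [0.0] * (n - 1) + [1.0]
    elif model == "linear":
        weights = [1.0] * n
    elif model == "decay":
        # Touchpoints closer to the conversion receive more weight
        weights = [0.5 ** ((n - 1 - i) / half_life) for i in range(n)]
    else:
        raise ValueError(f"unknown model: {model}")
    total = sum(weights)
    credit = {}
    for channel, weight in zip(touchpoints, weights):
        credit[channel] = credit.get(channel, 0.0) + weight / total
    return credit


journey = ["organic", "email", "paid_ad"]  # hypothetical customer journey
for model in ("first", "last", "linear", "decay"):
    print(model, attribute(journey, model))
```

The same journey credits different channels depending on the model, which is why the choice of attribution model matters when comparing campaigns.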

Understanding these metrics is amazingly effective in narrowing down how users interact with your website.

Another way to integrate Google Analytics into your marketing strategy is by setting up goals. Goals are set up to track specific actions taken on your website. For example, you can set up goals to track purchases, newsletter signups, video plays, live chat, and social media shares.

If you want a more in-depth look at what Google Analytics can offer, you can learn the basics through their Analytics Academy.

marketing analytics tool
Google analysis feedback

Analysis and feedback platform (A&F)

A&Fs are another great piece of equipment in the marketer’s toolbox; more specifically for looking at how users are interacting on your website. One such A&F, HotJar, does this in the form of heatmaps and recordings. HotJar’s integrated tracking pixel allows you to see how far users scroll on your website and what items were clicked the most.

You can also watch recordings of a user’s experience and even filter down to the URL of the page you wish to track (e.g., /checkout/). This allows you to capture the user’s unique journey until they make a purchase. For each recording, you can view audience information such as geographical location, country, browser, operating system, and a documented list of user actions.

In addition to UX/UI metrics, you can also integrate polls and forms on your website for more intricate data about your users.

As a marketing manager, these tools help to visualize all of my data in ways that a pivot table can’t display. And while I am a genuine user of these platforms, I must admit that it’s not the tool that makes the man, it’s the strategy. To get the most use out of these platforms, you will need to understand what business problem you are trying to solve and what metrics are important to you.

There is a lot of information that these dashboards can provide you. However, it’s up to you to filter through the noise. Not every accessible metric applies to you, so you will need to decide what is the most important for your marketing plan.

A few similar platforms include:

Experimentation platforms

Experimentation platforms are software for experimenting with different variations of a sample. Their purpose is to run A/B tests, something HubSpot also does, but these platforms dive head first into them.

Experimentation Platforms

Where HubSpot only tests versions A and B, experimentation platforms let you test versions A, B, C, D, E, F, etc. They don’t just test the different versions, they will also test different audiences and how they respond to each test version. Searching “definition experimentation platforms” is a good place to start in understanding what experimentation platforms are. I can tell you they are a dream come true for marketers who love to get their hands dirty in behavioral targeting.
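To make the idea of testing several versions concrete, here is a minimal sketch that compares each variant against a control with a two-proportion z-test. The visitor and conversion counts are invented, and this is a generic statistical test rather than the method any particular platform uses.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z statistic, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical results: variant -> (conversions, visitors); "A" is the control
variants = {"A": (100, 2000), "B": (130, 2000), "C": (104, 2000)}

control = variants["A"]
results = {}
for name, (conversions, visitors) in variants.items():
    if name == "A":
        continue
    results[name] = two_proportion_z_test(control[0], control[1], conversions, visitors)
    z, p = results[name]
    print(f"{name} vs A: z = {z:.2f}, p = {p:.3f}")
```

When many variants are compared against one control, the chance of a false positive grows with the number of comparisons, so a stricter significance threshold is usually appropriate.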

Optimizely is one such example of a company offering in-depth A/B testing. Optimizely’s goal is to let you spend more time experimenting with the customer experience and less time wading through statistics to learn what works and what doesn’t. If you are unsure what to do, you can test it with Optimizely.

Using companies like Optimizely or Split is just one way to experiment. Many name-brand companies like Netflix, Microsoft, eBay, and Uber have all built their own experimentation platforms to use internally.

Not perfect

No one toolbox is perfect, and everyone is going to be different. One piece of advice I can give is to always understand the problem before deciding which tool is best to solve the problem. You wouldn’t use a hammer to do a job where a drill would be more effective, right?


You could, it just wouldn’t be the most efficient method. The same concept goes for marketing. Understanding the problem will help you know which tools should be in your toolbox.

August 18, 2022

In this blog, we discuss the applications of AI in healthcare. We take a deep dive into one application of AI, prognosis prediction, through an exercise. We build a simple prognosis detector and explain each step. Our predictor takes symptoms as inputs and predicts the prognosis using a classification model.

Introduction to prognosis prediction

The role of data science and AI (Artificial Intelligence) in the healthcare industry is not limited to predicting and tracking disease spread. It has now become possible to learn the likely causes of the symptoms you are experiencing, such as cough, fever, and body pain, without visiting a doctor, and to treat them at home. Platforms like Ada Health and Sensely can diagnose the symptoms you report.

If you have not already, please go back and read AI & Healthcare. If you have already read it, you will remember I wrote, “Predictive analysis, using historical data to find patterns and predict future outcomes can find the correlation between symptoms, patients’ habits, and diseases to derive meaningful predictions from the data.”

This tutorial will do just that: Predict the prognosis with symptoms as our input.

Exercise: Predict prognosis using symptoms as input

Prognosis Prediction Process

Import required modules

Let us start by importing all the libraries needed in the exercise. We import pandas because we will read the CSV files into DataFrames. We import LabelEncoder from the sklearn.preprocessing package. LabelEncoder is a utility class that converts non-numerical labels to numerical labels. In this exercise, we predict prognosis using symptoms, so it is a classification task.

We are using RandomForestClassifier, which consists of many individual decision trees that work as an ensemble. Learn more about RandomForestClassifier by enrolling in our Data Science Bootcamp, a remote instructor-led Bootcamp. We also require classification reports and accuracy score metrics to measure the model’s performance.

import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, accuracy_score

Read CSV files

We are using this Kaggle dataset for our exercise.

It has two files, Training.csv and Testing.csv, containing training and testing data, respectively. You can download these files by going to the data section of the above link.

Read the CSV files into DataFrames using the pandas read_csv() function. It reads a comma-separated file at the supplied file path into a DataFrame. It takes a file path as a parameter, so provide the path where you have downloaded the files.

train = pd.read_csv("File path of Training.csv")
test = pd.read_csv("File path of Testing.csv")

Check samples of the training dataset

To check what the data looks like, let us grab the first five rows of the DataFrame using the head() function.

We have 133 features. We want to predict prognosis, so that will be our target variable. The remaining 132 features are symptoms that a person experiences. The classifier will use these 132 symptom features to predict prognosis.

data frame
Head Data frame

The training set holds 4920 samples and 133 features, as shown by the shape attribute of the DataFrame.

(4920, 133)
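The two inspection steps above can be sketched on a tiny stand-in DataFrame. The name `toy_train` and its values are hypothetical; on the real training data the same calls would show the first five of the 4920 rows and the shape reported above.

```python
import pandas as pd

# Tiny stand-in for the training data; the real file has 132 symptom columns
toy_train = pd.DataFrame({
    "itching": [1, 0, 0, 1, 0, 0],
    "skin_rash": [1, 0, 1, 1, 0, 0],
    "cough": [0, 1, 0, 0, 1, 1],
    "prognosis": ["Fungal infection", "Common Cold", "Acne",
                  "Fungal infection", "Common Cold", "Common Cold"],
})

first_rows = toy_train.head()  # first five rows by default
print(first_rows)
print(toy_train.shape)         # (number of rows, number of columns)
```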

Descriptive analysis

A description of the data can be obtained with the describe() method of the DataFrame. We see no missing values in our DataFrame, as the count of every feature is 4920, which is also the number of samples in our DataFrame. We also see that all the numeric features are binary and have a value of either 1 or 0.
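As a rough sketch of these checks, the snippet below runs describe() and then verifies the two observations directly: no missing values, and only 0/1 values in the symptom columns. It uses a small hypothetical DataFrame (`toy_train`) in place of the real training data.

```python
import pandas as pd

# Small stand-in for the training data (hypothetical values)
toy_train = pd.DataFrame({
    "itching": [1, 0, 0, 1],
    "cough": [0, 1, 0, 0],
    "prognosis": ["Fungal infection", "Common Cold", "Acne", "Fungal infection"],
})
symptoms = toy_train.drop(columns=["prognosis"])

print(symptoms.describe())                     # count, mean, std, min, quartiles, max

missing_per_column = toy_train.isnull().sum()  # number of missing values per column
is_binary = symptoms.isin([0, 1]).all().all()  # True if every symptom value is 0 or 1
print(missing_per_column)
print(is_binary)
```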

Describe data frame
data frame objects
Describe data frame objects

Our target variable prognosis has 41 unique values, so there are 41 diseases into which the model will classify input. There are 120 samples for each unique prognosis in our dataset.
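A minimal sketch of how such counts can be obtained with pandas, using a hypothetical label column (`toy_labels`) rather than the real data:

```python
import pandas as pd

# Hypothetical target column with three classes, two samples each
toy_labels = pd.Series(
    ["Fungal infection", "Common Cold", "Acne",
     "Fungal infection", "Common Cold", "Acne"],
    name="prognosis",
)

n_classes = toy_labels.nunique()     # number of distinct prognoses
counts = toy_labels.value_counts()   # samples per prognosis
is_balanced = counts.nunique() == 1  # True when every class has the same count
print(n_classes)
print(counts)
print(is_balanced)
```

A balanced target like this one means plain accuracy is a reasonable first metric, since no class dominates the data.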

Prognosis Column
Value Count of Prognosis Column

There are 132 symptoms in our dataset. The names of the symptoms will be listed if we use this code block.

possible_symptoms = train[train.columns.difference(['prognosis'])].columns
print(list(possible_symptoms))

['abdominal_pain', 'abnormal_menstruation', 'acidity', 'acute_liver_failure', 'altered_sensorium', 'anxiety', 'back_pain', 'belly_pain', 'blackheads', 'bladder_discomfort', 'blister', 'blood_in_sputum', 'bloody_stool', 'blurred_and_distorted_vision', 'breathlessness', 'brittle_nails', 'bruising', 'burning_micturition', 'chest_pain', 'chills', 'cold_hands_and_feets', 'coma', 'congestion', 'constipation', 'continuous_feel_of_urine', 'continuous_sneezing', 'cough', 'cramps', 'dark_urine', 'dehydration', 'depression', 'diarrhoea', 'dischromic _patches', 'distention_of_abdomen', 'dizziness', 'drying_and_tingling_lips', 'enlarged_thyroid', 'excessive_hunger', 'extra_marital_contacts', 'family_history', 'fast_heart_rate', 'fatigue', 'fluid_overload', 'fluid_overload.1', 'foul_smell_of urine', 'headache', 'high_fever', 'hip_joint_pain', 'history_of_alcohol_consumption', 'increased_appetite', 'indigestion', 'inflammatory_nails', 'internal_itching', 'irregular_sugar_level', 'irritability', 'irritation_in_anus', 'itching', 'joint_pain', 'knee_pain', 'lack_of_concentration', 'lethargy', 'loss_of_appetite', 'loss_of_balance', 'loss_of_smell', 'malaise', 'mild_fever', 'mood_swings', 'movement_stiffness', 'mucoid_sputum', 'muscle_pain', 'muscle_wasting', 'muscle_weakness', 'nausea', 'neck_pain', 'nodal_skin_eruptions', 'obesity', 'pain_behind_the_eyes', 'pain_during_bowel_movements', 'pain_in_anal_region', 'painful_walking', 'palpitations', 'passage_of_gases', 'patches_in_throat', 'phlegm', 'polyuria', 'prominent_veins_on_calf', 'puffy_face_and_eyes', 'pus_filled_pimples', 'receiving_blood_transfusion', 'receiving_unsterile_injections', 'red_sore_around_nose', 'red_spots_over_body', 'redness_of_eyes', 'restlessness', 'runny_nose', 'rusty_sputum', 'scurring', 'shivering', 'silver_like_dusting', 'sinus_pressure', 'skin_peeling', 'skin_rash', 'slurred_speech', 'small_dents_in_nails', 'spinning_movements', 'spotting_ urination', 'stiff_neck', 'stomach_bleeding', 'stomach_pain', 
'sunken_eyes', 'sweating', 'swelled_lymph_nodes', 'swelling_joints', 'swelling_of_stomach', 'swollen_blood_vessels', 'swollen_extremeties', 'swollen_legs', 'throat_irritation', 'toxic_look_(typhos)', 'ulcers_on_tongue', 'unsteadiness', 'visual_disturbances', 'vomiting', 'watering_from_eyes', 'weakness_in_limbs', 'weakness_of_one_body_side', 'weight_gain', 'weight_loss', 'yellow_crust_ooze', 'yellow_urine', 'yellowing_of_eyes', 'yellowish_skin']

There are 41 unique prognoses in our dataset. The name of all prognoses will be listed if we use this code block:

['Fungal infection','Allergy','GERD','Chronic cholestasis','Drug Reaction','Peptic ulcer diseae','AIDS','Diabetes ','Gastroenteritis','Bronchial Asthma','Hypertension ','Migraine','Cervical spondylosis','Paralysis (brain hemorrhage)','Jaundice','Malaria','Chicken pox','Dengue','Typhoid','hepatitis A','Hepatitis B','Hepatitis C','Hepatitis D','Hepatitis E','Alcoholic hepatitis','Tuberculosis','Common Cold','Pneumonia','Dimorphic hemmorhoids(piles)','Heart attack','Varicose veins','Hypothyroidism','Hyperthyroidism','Hypoglycemia','Osteoarthristis','Arthritis','(vertigo) Paroymsal  Positional Vertigo','Acne','Urinary tract infection','Psoriasis','Impetigo']
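
A listing like the one above can be produced with the pandas unique() method. The snippet below is a minimal sketch on a small, made-up stand-in for the train DataFrame (the rows are hypothetical and not taken from the dataset); on the real data, the same call on train['prognosis'] should list all 41 names.

```python
import pandas as pd

# Toy stand-in for the article's `train` DataFrame (hypothetical rows).
train = pd.DataFrame({
    "itching": [1, 0, 0, 1],
    "cough": [0, 1, 1, 0],
    "prognosis": ["Fungal infection", "Common Cold",
                  "Common Cold", "Fungal infection"],
})

# unique() returns the distinct values in order of first appearance.
prognoses = list(train["prognosis"].unique())
print(len(prognoses), prognoses)
```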

Data visualization

new_df = train[train.columns.difference(['prognosis'])]
# Maximum symptoms present for a prognosis: 17
# Minimum symptoms present for a prognosis: 3
series = new_df.sum(axis=0).nlargest(n=15)
pd.DataFrame(series, columns=["Occurrence"]).loc[::-1, :].plot(kind="barh")
Horizontal bar chart for Occurrence column

Fatigue and vomiting are the symptoms most often seen.

Encode object prognosis

Our target variable, prognosis, is a categorical feature. Let us create an instance of LabelEncoder and fit it on the prognosis column of both the train and test data, so that it maps every possible category to a numerical value.

label_encoder = LabelEncoder()
label_encoder.fit(pd.concat([train['prognosis'], test['prognosis']]))
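
To see what the encoder does, here is a minimal pure-Python sketch of the same idea. It is an illustration, not scikit-learn's implementation: the helper names fit_classes, encode, and decode are hypothetical, and the labels are a toy sample. Like LabelEncoder, it sorts the unique class names and uses each name's position as its code.

```python
def fit_classes(labels):
    """Collect the sorted unique class names (what LabelEncoder keeps in classes_)."""
    return sorted(set(labels))


def encode(classes, labels):
    """Map each label to the index of its class name."""
    index = {name: code for code, name in enumerate(classes)}
    return [index[name] for name in labels]


def decode(classes, codes):
    """Map integer codes back to the original class names."""
    return [classes[code] for code in codes]


# Hypothetical labels standing in for the train and test prognosis columns.
train_labels = ["Malaria", "Acne", "Dengue", "Acne"]
test_labels = ["Dengue", "Malaria"]

# Fit on both splits so every label seen later has a code.
classes = fit_classes(train_labels + test_labels)
encoded = encode(classes, train_labels)

print(classes)  # ['Acne', 'Dengue', 'Malaria']
print(encoded)  # [2, 0, 1, 0]
print(decode(classes, encoded) == train_labels)  # True
```

Fitting on the combined labels mirrors the pd.concat() call above: a label that appears only in the test data still receives a code.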

This concludes the data preparation step. We can now move on to model training with this data.

Training and evaluating model

Let us train a RandomForestClassifier on the prepared data. We initialize the classifier, fit it on the training features and labels, and then make predictions on the test data.

Before training, we encode the prognosis labels as integers with the LabelEncoder object. Predicted codes can later be converted back to the original names with its inverse_transform() method.

random_forest = RandomForestClassifier()
# The encoder is already fitted, so transform() is enough (no refitting needed)
random_forest.fit(train[train.columns.difference(['prognosis'])], label_encoder.transform(train['prognosis']))
y_pred = random_forest.predict(test[test.columns.difference(['prognosis'])])
y_true = label_encoder.transform(test['prognosis'])
print("Accuracy:", accuracy_score(y_true, y_pred))
# target_names must list each class once, in encoded order
print(classification_report(y_true, y_pred, target_names=label_encoder.classes_))
Classification report
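
For intuition about the numbers in the report, the sketch below recomputes accuracy, plus precision and recall for one class, in plain Python. The function names and the toy labels are assumptions for illustration only; in the exercise itself, accuracy_score and classification_report do this work.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def precision_recall(y_true, y_pred, label):
    """Precision and recall for one class; 0.0 when a denominator is zero."""
    true_pos = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    predicted = sum(p == label for p in y_pred)
    actual = sum(t == label for t in y_true)
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Toy encoded labels (hypothetical, not from the dataset).
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]

print(accuracy(y_true, y_pred))             # 0.8 (4 of 5 correct)
print(precision_recall(y_true, y_pred, 1))  # precision 2/3, recall 1.0
```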

Predict prognosis by taking symptoms as input

We have our model trained and ready to make predictions. We need to create a function that takes symptoms as input and predicts the prognosis as output. The function predict_prognosis() below does just that.

The function takes its input as a string of symptoms separated by spaces. We strip the string to remove whitespace at the beginning and end, then split it to create a list of symptoms. We cannot pass this list directly to the model because it contains symptom names, while the model expects a list of 0s and 1s marking the absence or presence of each symptom. So we build that list by checking every symptom column against the input. Finally, with the features in the desired form, we predict the prognosis and print it.

def predict_prognosis():
  print("List of possible Symptoms you can enter: ", list(train[train.columns.difference(['prognosis'])].columns))
  input_symptoms = list(input("\nEnter symptoms space separated: ").strip().split())
  test_value = []
  # Build a 0/1 vector in the same column order used for training
  for symptom in train[train.columns.difference(['prognosis'])].columns:
    if symptom in input_symptoms:
      test_value.append(1)
    else:
      test_value.append(0)
  np_test = np.array(test_value).reshape(1, -1)
  encoded_label = random_forest.predict(np_test)
  predicted_label = label_encoder.inverse_transform(encoded_label)[0]
  print("Predicted Prognosis: ", predicted_label)
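
The conversion from symptom names to a 0/1 feature vector can also be looked at in isolation. Below is a minimal sketch of that step, assuming a hypothetical helper symptoms_to_features() and a shortened, made-up list of feature names. Reporting unrecognized symptoms is an addition for illustration and is not part of the function above.

```python
def symptoms_to_features(raw_input, feature_names):
    """Turn a space-separated symptom string into a 0/1 vector.

    Returns the vector (ordered like feature_names) and a sorted list of
    entered symptoms that do not match any feature name.
    """
    entered = set(raw_input.strip().split())
    unknown = sorted(entered - set(feature_names))
    vector = [1 if name in entered else 0 for name in feature_names]
    return vector, unknown


# Shortened, hypothetical feature list; the real one has many more columns.
feature_names = ["abdominal_pain", "acidity", "anxiety", "cough", "fatigue"]

vector, unknown = symptoms_to_features("  acidity fatigue sneezing ", feature_names)
print(vector)   # [0, 1, 0, 0, 1]
print(unknown)  # ['sneezing']
```

Flagging unknown names helps catch typos or comma-separated input, which would otherwise silently produce a vector of zeros.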

Give input symptoms:


Predicted prognoses

Suppose we have these symptoms: abdominal pain, acidity, anxiety, and fatigue. To predict the prognosis, we enter the symptoms separated by spaces, using the names from the symptom list (for example, abdominal_pain acidity anxiety fatigue). The system splits the symptoms, transforms them into a form the model can use, and finally outputs the prognosis.
Output prognosis


To sum up, we discussed the applications of AI in healthcare and took a deep dive into one of them, prognosis prediction, through an exercise. We created a prognosis predictor with an explanation of each step. Finally, we tested our predictor by giving it input symptoms and got the prognosis as output.

Full Code Available!

August 18, 2022

Learning data analytics is a challenge for beginners. Take your learning experience of data analytics one step ahead with these twelve data analytics books. Explore a range of topics, from big data to artificial intelligence.


Books on Data Analytics

Data Analytics Books

1. Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking by Foster Provost and Tom Fawcett

This book is written by two globally esteemed data science experts who introduce their readers to the fundamental principles of data science and then dig deep into the important role data plays in business-related decision-making. They do a great job of demonstrating different techniques and ideas related to analytical thinking without getting into too many technicalities.

Through this book, you can not only begin to appreciate the importance of communication between business strategists and data scientists but can also discover how to approach business problems analytically to generate value.

2. The Data Science Design Manual by Steven S. Skiena

To survive in a data-driven world, we need the skills to analyze the datasets we acquire. Data science is closely tied to statistics, data visualization, machine learning, and mathematical modeling, and in this book Skiena gives beginners an introductory overview of this emerging discipline.

The second part of the book highlights the essential skills, knowledge, and principles required to collect, analyze and interpret data. This book leaves learners spellbound with its step-by-step guidance to develop an inside-out theoretical and practical understanding of data science.

The Data Science Design Manual is a thorough instructional guide for learners eager to kick off their learning journey in data science. Lastly, Skiena includes real-world applications of data science, a wide range of exercises, Kaggle challenges, and, most interestingly, examples from a data science show, The Quant Shop, to excite learners.

3. Data Analytics Made Accessible by Anil Maheshwari

Are you a data enthusiast looking to finally dip your toes in the field? Start with Data Analytics Made Accessible by Anil Maheshwari.  Get a sense of what data analytics is all about and how significant a role it plays in real-world scenarios with this informative, easy-to-follow read.

In fact, this book is considered such a vital resource that numerous universities across the globe have added it to their required textbooks list for their analytics courses. It sheds light on the relationship between business and data by talking at length about business intelligence, data mining, and data warehousing.  

4. Python for Data Analysis by Wes McKinney

Written by the main author of the Pandas library, Python for Data Analysis is a book that spells out the basics of manipulating, processing, cleaning, and crunching data in Python. It is a hands-on book that walks its readers through a broad set of real-world case studies and enables them to solve different types of data analysis problems. 

It introduces different data science tools in Python to the readers in order to get them started on loading, cleaning, transforming, merging, and reshaping data. It also walks you through creating informative visualizations using Matplotlib. 

5. Big Data: A Revolution That Will Transform How We Live, Work, and Think by Viktor Mayer-Schönberger and Kenneth Cukier

This book is tailor-made for those who want to know the significance of data analytics across different industries. In this work, these two renowned domain experts bring the buzzword ‘big data’ into the limelight and try to dissect how it’s impacting our world and changing our lives, for better or for worse.

It does not delve into the technical aspects of data science algorithms or applications; rather, it is a theoretical primer on what big data really is and how it’s becoming central to different walks of life. Apart from encouraging readers to embrace this ground-breaking technological development, it also reminds them of the potential digital hazards it poses and how we can protect ourselves from them.

6. Business Unintelligence: Insight and Innovation beyond Analytics and Big Data by Barry Devlin

This book is great for someone who is looking to read through the past, present, and future of business intelligence. Highlighting the great successes and overlooked weaknesses of traditional business intelligence processes, Dr. Devlin delves into how analytics and big data have transformed the landscape of modern-day business intelligence. 

It identifies the tried-and-tested business intelligence practices and provides insights into how the trinity of information, people, and process conjoin to generate competitive advantage and drive business success in this rapidly advancing world. Furthermore, in this book, Dr. Devlin recommends several new models and frameworks that businesses and companies can employ for an even better tomorrow.

Join our Data Science Bootcamp today to start your career in the world of data.

7. Storytelling with Data: A Data Visualization Guide for Business Professionals by Cole Nussbaumer Knaflic

Globally, the culture is visual. Everything we consume, from art and advertisements to TV, is visual. Data visualization is the art of narrating stories with a purpose. In this book, Knaflic highlights key points for telling a story effectively with data. The book covers the importance of situating your data story within a context, guides you toward the most suitable charts, graphs, and maps for spotting trends and outliers, and discusses how to declutter and retain focus on the key points.

This book is a valuable addition for anyone eager to grasp the basic concepts of data communication. Once you finish reading the book, you will gain a general understanding of several graphs that add a spark to the stories you create from data. Knaflic instills in you the knowledge to tell a story with an impact.

Learn about lead generation through data analytics in this blog

10 ways data analytics can help you generate more leads 


8. Developing Analytic Talent: Becoming a Data Scientist by Vincent Granville

Granville leveraged his lifetime’s experience of working with big data, business analytics, and predictive modeling to compose a “handbook” on data science and data scientists. In this book, you will find learnings that are rarely found in traditional statistical, programming, or computer science textbooks as the author writes from experiential knowledge rather than theoretical. 

Moreover, this book covers all the most valuable information to help you excel in your career as a data scientist. It talks about how data science came to the fore in recent times and became indispensable for organizations using big data. 

The book is divided into three components:

  • What is data science and how does it relate to other disciplines
  • Data science technical applications along with tutorials and case studies
  • Career resources for future and practicing data scientists

This data science book also helps decision-makers build a better analytics team by informing them about specialized solutions and their uses. Lastly, if you plan to launch a startup around data science, giving this book a read will give you an edge with some quick ideas based on Granville’s 20+ years of industry experience.

9. Learning R: A Step-By-Step Function Guide to Data Analysis by Richard Cotton

Non-technical users are scared off by programming languages. This book is an asset for all non-tech learners of the R language. The author compiled a list of tools that make access to statistical models much easier. This book, step-by-step, introduces the reader to R without digging into the details of statistics and data modeling. 

The first part of this data science book introduces you to the basics of the R programming language. It discusses data structures, the data environment, looping constructs, and packages. If you are already familiar with the basics, you can begin with the second part of the book to learn the steps involved in data analysis, such as loading, cleaning, and transforming data. The second part also gives more insight into performing exploratory analysis and modeling.

10. Data Analytics: A Comprehensive Beginner’s Guide to Learn About the Realms of Data Analytics From A-Z by Benjamin Smith

Smith pens down the path to learning data analytics from A to Z in easy-to-understand language. The book offers simplified explanations for challenging topics like sophisticated algorithms, or even the Euclidean Square Estimate. At any point, while reading this book, you will not feel overwhelmed by technical jargon or menacing formulas. 

For each topic, the author gives a brief introduction, then explains a real-world use case, and only then brings in the technical jargon. Smith demonstrates almost every practical topic using Python, so that learners can recreate the projects by themselves. The handy tips and practical exercises are a bonus.

11. Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing, and Presenting Data by EMC Education Services

With Big Data analytics, you can explore more avenues of investigation and generate authentic outcomes to support businesses. It enables deeper insights that were previously out of reach for most people. Readers of Data Science and Big Data Analytics work on integrating real-time feeds and queries of structured and unstructured data. As you progress through the chapters of this book, you will open new paths to insight and innovation.

In this book, EMC Education Services introduces some of the key techniques and tools suggested by practitioners for Big Data analytics. Mastering these tools offers an opportunity to become an active contributor to challenging Big Data analytics projects. This data science book consists of twelve chapters that take the reader from the basics of Big Data analytics to a range of advanced analytical methods, including classification, regression analysis, clustering, time series, and text analysis.

These lessons serve multiple stakeholders, including business and data analysts looking to add Big Data analytics skills to their portfolio; database professionals and managers of business intelligence, analytics, or Big Data groups looking to enrich their analytic skills; and college graduates investigating data science as a career field.

12. An Introduction to Statistical Methods and Data Analysis by Lyman Ott

Lyman Ott discusses the powerful techniques used in statistical analysis for both advanced undergraduate and graduate students. This book helps students solve problems encountered in research projects. Not only does it greatly benefit students in decision making, but it also allows them to become critical readers of statistical analyses. The book has gained positive feedback from learners at different levels because it assumes readers have little or no mathematical background and therefore explains complex topics in an easy-to-understand way.

Ott covers introductory statistics extensively in the first 11 chapters. The book also targets students who struggle to ace their undergraduate capstone courses. Lastly, it provides research studies and examples that connect the statistical concepts to data analysis problems.

Upgrade your data science skillset with our Python for Data Science training!

August 17, 2022

From customer relationship management to tracking analytics, marketing tools are important in the modern world. Learn how to make the most of these tools.

What do you normally find in a toolbox? A hammer, screwdriver, nails, tape measure? If you’re building a bird house, these would be perfect for you, but what if you’re creating a marketing campaign? What tools do you want at your disposal? It’s okay if you can’t come up with any. We’re here to help.

These days marketing is all about data. Whether it’s a click on an email or an abandoned cart on Amazon, marketers are using data to better cater to the needs of the consumer. In order to analyze and use this data, marketers have a toolbox of their own.

So what are some of these tools and what do they offer? Here, at Data Science Dojo, we’ve come up with our top 5 marketing analytics tools for success.

Customer Relationship Management Platform (CRM)

CRM is a tool used for managing everything there is to know about the customer. It can track where and when a consumer visits your site, track the interactions on your site, and create profiles for leads. A few examples of CRMs are:


  • HubSpot, along with the two others listed above, took the idea of a CRM and made it into an all-inclusive marketing resort. Along with the traditional CRM uses, HubSpot can be used to:
  • Manage social media
  • Send mass email campaigns
  • View traffic, campaign, and customer analytics
  • Associate emails, blogs, and social media posts to specific marketing campaigns
  • Create workflows and sequences
  • Connect to your other analytics tools such as Google Analytics, Facebook Ads, Amazon seller competitor analysis, YouTube, and Slack.


HubSpot continues its effectiveness by creating reports allowing its users to analyze what is and isn’t working.

This is just a brief description revealing the tip of the iceberg of what HubSpot does. If you want to see below the water line, visit its website.

Search Software

Search engine optimization (SEO) is the process of improving how a website ranks on search engines. It’s how you are able to find everything you have ever searched for on Google. Search software helps marketers analyze how best to optimize websites so that potential consumers can find them.

A few search software companies are:


I would love to describe each one of the above businesses, but I only have experience with Moz. Moz focuses on a

“less invasive way (of marketing) where customers are earned rather than bought”.

In fact, its entire business is focused on upgrading your SEO. Moz offers 9 different services through its Moz Pro toolkit: