Artificial Intelligence

Generative AI – Understanding the ethics and societal impact of emerging trends
Ayesha Saleem
| March 31, 2023

Artificial intelligence (AI), machine learning (ML), and data science have become some of the most significant topics of discussion in today’s technological era. In one of the speakers’ sessions on the ‘Future of Data and AI’, several experts in these fields came together to discuss the latest advancements and how they are using them in their everyday work. 

Introduction of panelists 

The session starts with Hamza, a research science manager at Google, introducing himself and explaining how he runs a few ML models and helps build models that can predict user abuse. Hamza works in the trust and safety group within search, where they prioritize the protection of users. 

Generative AI: Trends, Ethics and Societal Impact – Watch the complete session  

The other experts introduce themselves as well. Batool has experience working as an AI scientist at Amazon, where she focused on dialogue machines and natural language understanding.

Meanwhile, Francesca, a principal data scientist manager at Microsoft, leads teams of data scientists and ML scientists working on internal problems at Microsoft. Raja, the founder and chief data scientist at Data Science Dojo, has been working in data science since before it was even called data science.

Use of Generative AI 

The conversation then shifts to the use of generative AI, which has been used in the field of data science and ML for a while. Francesca explains that there are three main categories where generative AI is used every day in her work.  

The first is generating natural language, which includes summarization, translation, and question-answering systems. The second is image and video generation, which has applications in industries like gaming and advertising. The third is generating music, which can be used for composing, arranging, and creating personalized music.

A deeper understanding of the current state of the field 

The experts then discuss the latest advancements in these fields. Raja emphasizes the importance of the latest advancements in deep learning, specifically transformers, in NLP tasks. He also mentions the development of large-scale language models like GPT-3, which can perform tasks like translation, summarization, and question-answering. 

Batool discusses how chatbots have evolved from rule-based systems to data-driven systems that are trained on data to improve their performance. This includes using natural language processing to understand and respond to user queries more effectively.
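The shift from rule-based to data-driven chatbots can be illustrated with a minimal sketch. Everything in it is hypothetical and chosen only for illustration: the `RULES` table, the `TRAINING` examples, and the word-overlap matching are assumptions, not anything the panel described. Real data-driven systems use trained language models rather than word overlap.

```python
from collections import Counter

FALLBACK = "Sorry, I did not understand that."

# Hypothetical rule-based bot: a fixed keyword -> reply table written by hand.
RULES = {
    "refund": "You can request a refund from the Orders page.",
    "hours": "We are open 9am to 5pm, Monday to Friday.",
}

def rule_based_reply(message):
    """Return the reply for the first keyword found in the message."""
    text = message.lower()
    for keyword, reply in RULES.items():
        if keyword in text:
            return reply
    return FALLBACK

# Hypothetical data-driven bot: replies come from example conversations,
# so coverage grows by adding data instead of writing new rules.
TRAINING = [
    ("how do i get my money back", "You can request a refund from the Orders page."),
    ("when are you open", "We are open 9am to 5pm, Monday to Friday."),
]

def tokens(text):
    """Count the lowercase words in a piece of text."""
    return Counter(text.lower().split())

def data_driven_reply(message):
    """Return the reply of the training example sharing the most words."""
    query = tokens(message)

    def overlap(example):
        return sum((query & tokens(example[0])).values())

    best = max(TRAINING, key=overlap)
    return best[1] if overlap(best) > 0 else FALLBACK
```

With the message "I want my money back", the rule-based bot falls through to the fallback because no keyword matches, while the data-driven bot finds the closest example and returns the refund reply.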

Francesca highlights the importance of democratizing AI and making it accessible to all people, regardless of their technical background. This involves developing user-friendly tools that people without technical expertise can use to address common business problems.

Generative AI – The impact of ground-breaking generative AI technologies 

OpenAI has brought about a major transformation in the fields of artificial intelligence (AI), data science, and machine learning. One of its most significant contributions is its generative AI capabilities, which help in generating code, creating images, and troubleshooting bugs. These capabilities are particularly useful for data scientists who need to deploy and operationalize their machine-learning applications.

Ground-breaking Generative AI

Translating code from one programming language to another is one of the three main categories where generative AI applications have been seeing a lot of demand. Another popular application of generative AI is image generation, especially generating images from text descriptions.

For data scientists like the speaker, who work mostly in the AI, data science, and machine learning space, most of their work is done on the cloud. With OpenAI, data scientists can now access pre-trained generative AI models and customize them with their own data. They can also use built-in tools to detect and mitigate any biases or unfair dynamics that may exist in their applications.

OpenAI has made accessing these tools easier through the OpenAI studio, where one can build AI models and deploy them faster. The speaker has found this to be a privileged situation and has been using generative AI for various communication purposes such as spot-checking, rephrasing, and creating snippets for social media posts.

Human intelligence in conjunction with AI 

While AI has brought about a significant change in the field of content creation, the speaker warns against relying solely on AI. Human intelligence should be used in conjunction with AI to create the best results. AI is just another tool that should be used with caution, as a few wrong jumps can take you in the wrong direction. 

The other speakers in the panel discussion also shared their experiences with generative AI. One of them is writing a book that covers popular machine learning algorithms using fiction. Until a few years ago, his biggest concern was hiring graphic designers and concept artists. Now, with generative AI, he can create his book’s graphics on his own.

Generative AI’s impact on creative work  

Generative AI is impacting creative work and work in general in many ways. In creative industries, such as marketing, graphic design, animation, and content creation, generative AI is a valuable tool that allows for faster and more efficient production of high-quality content. It can also democratize access to expensive resources like models for photo shoots, making them more accessible to smaller designers. 

In other industries, such as manufacturing, healthcare, and energy, generative AI can also be used to improve efficiency and productivity. For example, it can be used to design new products, optimize manufacturing processes, and analyze medical images. 

Overall, generative AI has the potential to impact work across many different industries, and its adoption is likely to continue to grow as more businesses discover its benefits. While it may not eliminate jobs, it will likely change the nature of work in many industries, requiring workers to learn new skills to work effectively with these tools. 

Read about 12 must-have AI tools to revolutionize your work 

Francesca emphasizes the importance of considering the ethical implications of working with AI systems in general, not just generative AI. She follows a checklist of principles: fairness, reliability, safety, privacy and security, inclusiveness, accountability, and transparency, which are industry standards developed by tech companies. While these principles are essential to keep in mind, Francesca also suggests that tools such as InterpretML and fair AI can be leveraged to better understand the impact of data on predictions and results.

OpenAI and generative AI have many benefits, such as improving content quality, variety, and personalization. However, to ensure these benefits follow ethical principles, the model life cycle, which starts with data collection, pre-processing, model building, and tuning model parameters and ends with prediction and interpretation, must involve humans in all stages.

Generative AI in healthcare and energy

Generative AI in Healthcare

AI in healthcare

There are many exciting ways that generative AI is being used to tackle important problems in the fields of healthcare and energy. One area where generative AI is being used in healthcare is in the creation of medical images such as X-rays and MRIs. With the help of generative AI, researchers can generate high-quality medical images that can help in the diagnosis and treatment of various medical conditions. 

It is also being used to develop new drugs and treatments. With the help of deep learning algorithms, researchers can analyze large amounts of medical data to identify new drug candidates and develop personalized treatment plans for patients. 

In the field of energy, generative AI is being used to optimize energy systems and reduce energy consumption. For example, AI models can be trained to predict energy usage patterns and adjust energy supply accordingly, reducing waste and increasing efficiency.
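The predict-then-adjust loop described here can be sketched in a few lines. This is a deliberately simple illustration, not a generative model: a moving average stands in for the trained predictor, and the function names, the 3-hour window, and the 10% reserve margin are all assumptions made for the example.

```python
def forecast_next(usage_kwh, window=3):
    """Forecast the next reading as the mean of the most recent readings.

    A moving average is a stand-in here for a trained prediction model.
    """
    recent = usage_kwh[-window:]
    return sum(recent) / len(recent)

def plan_supply(usage_kwh, reserve=0.10, window=3):
    """Plan supply as the forecast plus a hypothetical safety reserve."""
    return forecast_next(usage_kwh, window) * (1 + reserve)

# Hypothetical hourly usage readings in kWh.
hourly_usage = [120.0, 130.0, 125.0, 140.0, 150.0, 160.0]

predicted = forecast_next(hourly_usage)  # mean of 140, 150, 160
planned = plan_supply(hourly_usage)      # forecast plus 10% reserve
```

Supplying only the forecast plus a small reserve, rather than a fixed worst-case amount, is where the reduction in waste comes from. A better predictor allows a smaller reserve.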

Another area where generative AI is being used is in the creation of virtual environments for training purposes. With the help of generative AI, researchers can create realistic virtual environments that can be used to train individuals in various fields such as medicine, engineering, and military training. This can help to reduce the risk of accidents and injuries during training and improve overall safety. 

Generative AI and government regulations

The role of the government in regulating the use of generative AI to create content is a highly debated topic. Some believe that the government should intervene to prevent monopolies and to fund open-source projects that democratize data. Others argue that too much regulation could stifle innovation and competition.

It is essential to strike a balance between promoting innovation and protecting consumers’ interests. Legislation and regulations could be created to define what constitutes fair use and set standards for the ethical use of AI, such as the AI Bill of Rights. Ultimately, governments will act in line with the culture and values of society in their region, making laws that align with what is considered acceptable.

Closing of the session – Generative AI  

In conclusion, AI, ML, and data science have become vital to our daily lives, with advancements in these fields impacting various industries. With the continuous development of new technology, it is essential to keep up to date with the latest trends and advancements to stay competitive in the industry. The experts in the session provided valuable insights into the latest advancements and how they are using them in their everyday work. As these fields continue to evolve, it will be exciting to see what new advancements will come next. 


ChatGPT detection made easy – Top 5 free tools for identifying chatbots
Ruhma Khawaja
| February 28, 2023

Meet ChatGPT, the AI tool that has revolutionized the way people work by enabling the creation of websites, apps, and even novels. However, with its increasing popularity, bad actors have also emerged, using it to cheat on exams and generate fake content.

To help you combat this issue, we’ve compiled a list of five free AI content detectors to verify the authenticity of the content you come across.

For the unversed – What is ChatGPT?

ChatGPT is an artificial intelligence language model developed by OpenAI. It is designed to generate human-like responses to natural language inputs, making it an ideal candidate for chatbot applications. ChatGPT is trained on vast amounts of text data and is capable of understanding and responding to a wide range of topics and questions.

While ChatGPT is a powerful tool, it’s important to be able to distinguish between real and fake chatbots, which is why tools for detecting ChatGPT and other fake chatbots have become increasingly important.

Read more about ChatGPT and how this AI tool is a game changer for businesses.

Overrated or underrated – Is ChatGPT reshaping the world?

ChatGPT, as an advanced language model, is reshaping the world in a number of ways. Here are some of the ways it is making an impact: 

  • Improving customer service – ChatGPT is being used by companies to improve their customer service by creating chatbots that can provide human-like responses to customer queries. This helps to reduce response times and improve the overall customer experience. 
  • Revolutionizing language translation – It is being used to improve language translation services by creating chatbots that can translate between languages in real-time, making communication between people who speak different languages easier. 
  • Advancing healthcare – ChatGPT is being used to create chatbots that can assist healthcare professionals by providing medical advice and answering patient queries. 
  • Transforming education –  The popular AI tool is being used to create chatbots that can assist students with their studies by providing answers to questions and offering personalized feedback.

5 free tools for detecting ChatGPT 

As artificial intelligence (AI) continues to advance, the use of chatbots and virtual assistants has become increasingly common. However, with the rise of AI, there has also been an increase in the use of fake chatbots, which can be used to deceive users for fraudulent purposes. As a result, it’s important to be able to detect whether you’re interacting with a real chatbot or a fake one. In this article, we’ll look at five free tools for detecting ChatGPT.

Top tools for detecting ChatGPT – Data Science Dojo

1. Botometer:

Botometer is a free online tool developed at Indiana University’s Observatory on Social Media. It uses machine learning algorithms to detect whether a Twitter account is a bot or a human. It considers a range of factors, including the frequency and timing of tweets, the language used in tweets, and the presence of certain hashtags or URLs. Botometer can also detect the likelihood that the bot is using ChatGPT or another language model.

2. Bot Sentinel:

Bot Sentinel is another free online tool that can detect and analyze Twitter accounts that exhibit bot-like behavior. It uses a variety of factors to identify accounts that are likely to be bots, such as the frequency of tweets, the similarity of tweets to other bots, and the use of certain keywords or hashtags. Bot Sentinel can also identify accounts that are likely to be using ChatGPT or other language models.

3. Botcheck.me:

Botcheck.me is a free tool that analyzes Twitter accounts to determine the likelihood that they are bots. It considers a range of factors, such as the frequency and timing of tweets, the similarity of tweets to other bots, and the presence of certain hashtags or URLs. Botcheck.me can also detect whether a bot is using ChatGPT or other language models.
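To make the "frequency and timing of tweets" signal concrete, here is a minimal sketch of how such features could be computed. It is not the method of Botometer, Bot Sentinel, or Botcheck.me, whose internals are not described here. The function names and the thresholds (`max_rate`, `min_variation`) are hypothetical, and real tools combine many more signals.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev

def timing_features(timestamps):
    """Compute posting rate and regularity from tweet timestamps."""
    ordered = sorted(timestamps)
    gaps = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    # Floor the observed span at one hour to avoid dividing by zero.
    span_days = max((ordered[-1] - ordered[0]).total_seconds() / 86400, 1 / 24)
    avg_gap = mean(gaps)
    return {
        "tweets_per_day": len(ordered) / span_days,
        # Coefficient of variation: near 0 means clockwork-like posting.
        "gap_variation": pstdev(gaps) / avg_gap if avg_gap > 0 else 0.0,
    }

def looks_automated(timestamps, max_rate=100.0, min_variation=0.1):
    """Flag accounts that post very often or at very regular intervals.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    if len(timestamps) < 3:
        return False  # too little data to judge
    features = timing_features(timestamps)
    return (features["tweets_per_day"] > max_rate
            or features["gap_variation"] < min_variation)

start = datetime(2023, 1, 1)
# Hypothetical account posting exactly every 10 minutes.
scheduled = [start + timedelta(minutes=10 * i) for i in range(50)]
# Hypothetical account posting at irregular times over several days.
irregular = [start + timedelta(hours=h) for h in (0, 5, 6, 30, 31, 80, 100)]
```

Here `looks_automated(scheduled)` is true and `looks_automated(irregular)` is false. A heuristic like this only flags automation in posting behavior and says nothing about whether the text itself came from a language model.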

4. OpenAI’s GPT-3 Detector:

OpenAI has developed a tool that can detect whether a given text was generated by their GPT-3 language model or a human. While it’s not specifically designed to detect ChatGPT, it can be useful for identifying text generated by language models. The tool uses a deep neural network to analyze the language in the text and compare it to known patterns of human language and GPT-3-generated language.

5. Hugging Face Transformers:

Hugging Face offers a free, open-source library of natural language processing tools, including several models that can detect language-based chatbots. Their “pipeline” tool can be used to quickly detect whether a given text was generated by ChatGPT or other language models. Hugging Face Transformers is used by researchers, developers, and other professionals working with natural language processing and machine learning.

Why are chatbot detectors essential for professionals?

There are several groups of people who may want chatbot detectors, including: 

  • Business owners: Business owners who rely on chatbots for customer service may want detectors to ensure that their customers are interacting with a genuine chatbot and not a fake one. This can help to protect their customers from scams or fraud. 
  • Consumers: Consumers who interact with chatbots may want detectors to protect themselves from fraudulent chatbots or phishing scams. This can help them to avoid sharing personal information with a fake chatbot. 
  • Researchers: Researchers who are studying chatbots may want detectors to help them identify which chatbots are powered by ChatGPT or other language models. This can help them to understand how language models are being used in chatbot development and how they are being integrated into different applications. 
  • Developers: Chatbot developers who are working with ChatGPT may want detectors to ensure that their chatbots are providing accurate and reliable responses to users. This can help them to build better chatbots that can provide a more satisfying user experience.

Wrapping up 

Love it or hate it – ChatGPT is here to stay. However, with the increasing use of AI in chatbots and virtual assistants, it’s important to be able to detect whether you’re interacting with a real chatbot or a fake one. These five free tools can help you detect ChatGPT and other fake chatbots, helping you to stay safe online.


12 must-have AI tools to revolutionize your daily routine
Ali Haider Shalwani
| February 18, 2023

This blog outlines a collection of 12 AI tools that can assist with day-to-day activities and make tasks more efficient and streamlined.  


Top fintech trends to look out for in 2023 
Hudaiba Soomro
| January 18, 2023

Despite major layoffs in 2022, there are many optimistic fintech trends to look out for in 2023. Every crisis spells new opportunities. In this blog, let’s see what the future holds for fintech trends in 2023.

ChatGPT – An insight to fun facts for all Data Scientists 
Ayesha Saleem
| January 3, 2023

In this blog, we asked ChatGPT to come up with some interesting and fun facts about the core areas related to data science, artificial intelligence, machine learning, and statistics. The new AI tool surely knows the struggles of professionals and how to put them in a fun way.


Some fun facts for Mathematicians by ChatGPT:   

Mathematicians - ChatGPT
Mathematicians’ fun facts by ChatGPT


Here are some fun facts by ChatGPT to make all statisticians and mathematicians laugh:

  • Statisticians have a sense of humor too! The probability of someone laughing at a joke you tell is directly proportional to the number of statistical jokes you know.  
  • A statistician’s favorite thing to do is to take a sample. 
  • A statistician’s favorite drink is a “mean” martini. 
  • A statistician’s favorite time of day is “mean time.” 
  • Statisticians have a tendency to “mean” well. 
  • Statisticians are always looking for a “standard deviation” from the norm. 
  • Statisticians are always looking for a “mean” to get ahead. 
  • Statisticians are never “mean” to each other – they are always “normally distributed.”
  • Mathematicians are always on the lookout for new “equations” to solve.  
  • A mathematician’s favorite thing is to “prove” a theorem.  
  • Mathematicians are known to have a “matrix” of connections in the industry. 
  • Mathematicians have been known to “calculate” the most unlikely scenarios. 
  • A mathematician’s favorite type of math is “abstract” math – it keeps things interesting! 
  • Mathematicians are always looking for ways to “optimize” their equations. 
  • Mathematicians have a saying: “The more complex the problem, the better!” 
  • Mathematicians are known to be “precise” – they never leave room for error in their calculations.  



Some fun facts for Data Scientists by ChatGPT  

ChatGPT - Data Scientists
ChatGPT fun facts for Data Scientists


Here are a few funny facts about data scientists: 

  • Data scientists have been known to “mine” for data in the most unlikely places. 
  • A data scientist’s favorite type of data is “big data” – the bigger, the better! 
  • A data scientist’s favorite tool is the “data hammer” – they can pound any data into submission. 
  • Data scientists have a saying: “The data never lies, but it can be misleading.” 
  • Data scientists have been known to “data dunk” their colleagues – throwing them into a pool of data and seeing if they can swim. 
  • Data scientists are always “data mining” for new insights and discovering “data gold.” 
  • Data scientists are known to have “data-phoria” – a state of excitement or euphoria when they uncover a particularly interesting or valuable piece of data. 
  • Data scientists have been known to “data mash” – combining different datasets to create something new and interesting. 


 Enroll in our Data Science Bootcamp course to become a Data Scientist today


Some fun facts for Machine Learning professionals by ChatGPT 

Machine learning professionals
Machine learning professionals’ fun facts by ChatGPT


Here are some fun facts about machine learning professionals:

  • Machine learning professionals are always on the lookout for new “learning opportunities.” 
  • A machine learning professional’s favorite thing is to “train” their algorithms. 
  • Machine learning professionals are known to have a “neural network” of friends in the industry. 
  • Machine learning professionals have been known to “deep learn” on the job – immersing themselves in their work and picking up new skills along the way. 
  • A machine learning professional’s favorite type of data is “clean” data – it makes their job much easier! 
  • Machine learning professionals are always looking for ways to “optimize” their algorithms. 
  • Machine learning professionals have a saying: “The more data, the merrier!” 
  • Machine learning professionals are known to be “adaptive” – they can quickly adjust to new technologies and techniques. 


Some fun facts for AI experts by ChatGPT 

AI experts - ChatGPT
ChatGPT fun fact for AI experts


Here are a few funny facts about artificial intelligence experts:   

  • AI experts are always on the lookout for new “intelligent” ideas. 
  • AI experts have been known to “teach” their algorithms to do new tasks. 
  • AI experts are known to have a “neural network” of connections in the industry. 
  • AI experts have been known to “deep learn” on the job – immersing themselves in their work and picking up new skills along the way. 
  • AI experts are always looking for ways to “optimize” their algorithms. 
  • AI experts have a saying: “The more data, the smarter the AI!” 
  • AI experts are known to be “adaptive” – they can quickly adjust to new technologies and techniques. 
  • AI experts are always looking for ways to make their algorithms more “human-like.”  
  • The term “artificial intelligence” was first coined in 1956 by computer scientist John McCarthy. 
  • The first recorded instance of artificial intelligence was in the early 1800s when mathematician Charles Babbage designed a machine that could perform basic mathematical calculations. 
  • One of the earliest demonstrations of artificial intelligence was the “Turing Test,” developed by Alan Turing in 1950. The test is a measure of a machine’s ability to exhibit intelligent behavior equivalent to, or indistinguishable from, that of a human. 
  • The first self-driving car was developed in the 1980s by researchers at Carnegie Mellon University. 
  • In 1997, a computer program called Deep Blue defeated world chess champion Garry Kasparov, marking the first time a computer had beaten a human at chess. 
  • In 2011, a machine translation system developed by Google called Google Translate was able to translate entire documents from Chinese to English with near-human accuracy. 
  • In 2016, a machine learning algorithm developed by Google DeepMind called AlphaGo defeated the world champion at the ancient Chinese board game Go, which is considered to be much more complex than chess. 
  • Artificial intelligence has the potential to revolutionize a wide range of industries, including healthcare, finance, and transportation.  


Some fun facts for Data Engineers by ChatGPT 

ChatGPT fun facts for data engineers


Here are a few funny facts about data engineers by ChatGPT: 

  • Data engineers are always on the lookout for new “pipelines” to build. 
  • A data engineer’s favorite thing is to “ingest” large amounts of data. 
  • Data engineers are known to have a “data infrastructure” of connections in the industry. 
  • Data engineers have been known to “scrape” the internet for new data sources. 
  • A data engineer’s favorite type of data is “structured” data – it makes their job much easier! 
  • Data engineers are always looking for ways to “optimize” their data pipelines. 
  • Data engineers have a saying: “The more data, the merrier!” 
  • Data engineers are known to be “adaptive” – they can quickly adjust to new technologies and techniques. 



Do you have a more interesting answer by ChatGPT?

People across the world are generating interesting responses using ChatGPT. The new AI tool contributes immensely to the knowledge of professionals across different industries. Not only does it produce witty responses, but it also shares information that many did not know before. Share with us how you use this amazing AI tool as a Data Scientist.

Dissecting ChatGPT – Is this new AI tool a game changer?
Hudaiba Soomro
| December 14, 2022

ChatGPT is being hailed across the globe for disrupting major jobs and businesses. In this blog, we see how much of that hype is fair. 

After raging headlines like “Google is done” and “The college essay is dead”, ChatGPT is busy churning out sonnets and limericks about its downtime caused by heavy traffic. The news spreading like wildfire is that it will bring an end to jobs ranging from insurance agents to court reporters. Let’s dive in and assess how much of the hype is true.


ChatGPT – Data Science Dojo

Did ChatGPT kill the essay? 


OpenAI’s latest large language model (LLM), ChatGPT, claims to provide natural and conversational communication. It also claims to assist by giving advice and information, performing writing and coding tasks, and admitting mistakes. Naturally, people across the globe have been bombarding the bot with requests to check how great it really is.


Let’s consider the “death of the college essay”. A first read shows well-written essays on nearly any subject. Consider, for example, the academic essay on theories of nationalism being hailed as a “solid A- essay”. However, a closer look shows that this AI tool works by using existing templates, so college essays are churned out according to five-paragraph formulas.


ChatGPT essay


These academic essays also lack the sophistication provided by critical thinking skills. They reproduce existing content online and refashion it to fit a specific template. In style, they are dreadfully dull, lacking stylistic human expressions.

ChatGPT’s poetic output shows the same emulation of reworked formulas: it obeys the rhyme scheme technically, but the lack of ingenuity is evident.


An obvious conclusion, then, is that ChatGPT, while great at reorganizing text to fit templates, is deeply unaware of what the text means. This comes as no surprise to those with even a rudimentary understanding of natural language processing and its applications.

The function of large language models (LLMs) is far from epistemological and is rather based on identifying patterns and replicating them.


ChatGPT sonnet



Here, it should be noted that AI tools such as ChatGPT can help humans perform routinized, well-formulated tasks such as producing well-structured poetry or college essays. However, they lack the key insights that human intelligence provides, regardless of the field of study.


Is ChatGPT a source of information or misinformation? 


One feature behind ChatGPT’s performance across a range of writing tasks is its ability to fetch information quickly. Because of this ability, it is being hailed as the end of Google. However, a few differences between large language models and search engines are important to consider.

For example, search engines work by hunting the web for all links that are related to the search query. Their selling point is accuracy, as they only connect you to other sources. ChatGPT, however, can provide a response to nearly any query, even a nonsensical one.

Consider, for example, a user’s search query on designing “an electron app that is hosted on a remote server to give a desktop user notification.” As a response to this query, it came up with a completely fake method, revealing ChatGPT’s susceptibility to being a source of misinformation. 


ChatGPT answer


This tool would only admit to mistakes if prompted to do so via further inquiry, making it a rather risky tool. In contrast, a search engine would point to information from original sources. This ranks the practical utility of a search engine far above it, and settles the debate on whether ChatGPT is to replace Google any time soon.


Furthermore, ChatGPT’s ability to construct nonsensical ideas and arguments about nearly anything can make it unsafe for a first-time user. Only a trained eye will be capable of separating factually plausible ideas from mere fictional constructs. Here, again, human ingenuity and intelligence are needed to ensure tools like this are used in meaningful ways.


ChatGPT’s release a signal to rethink education 


ChatGPT’s advances are, however, relevant in considering the value of human creative output and in rethinking conventional education and training at schools that rely on memorizing and reproducing routine tasks. For circumstances where these tasks or skills are deemed essential, it’s simple to enforce testing practices that prevent access to such sources.


At the same time, it’s an unfair stretch to suggest that ChatGPT means the end of optimized search engines like Google and creative human tasks such as writing. At best, it can be used to assist humans in their projects, be it their daily tasks or work-related queries.

It is, at the end of the day, only a tool that can be integrated into a plethora of human initiatives.


Final words 

With limitations ranging from verbosity and inaccurate information to an obvious lack of sophisticated opinions, ChatGPT’s performance doesn’t quite meet the hype. Similarly, instead of offering natural conversations, ChatGPT has offered boring and dull essays, even when it comes to imitating a writer’s style.

At the same time, it is a tool that can be used by trained experts to perform certain routine tasks including writing, coding, and information fetching more easily. 


Top 5 AI skills and AI jobs to know about before 2023
Ayesha Saleem
| November 24, 2022

Looking for AI jobs? Well, here are our top 5 AI jobs along with all the skills needed to land them.

Rapid technological advances and the promotion of machine learning have shifted manual processes to automated ones. This has not only made the lives of humans easier but has also generated error-free results. To only associate AI with IT is baseless. You can find AI integrated into our day-to-day lives. From self-driven trains to robot waiters, from marketing chatbots to virtual consultants, all are examples of AI.

AI skills - AI jobs
AI Skills and AI Jobs

We can find AI everywhere without even knowing it. It is hard to explain how quickly it has become a part of our daily routine. AI will automatically find suitable searches, foods, and products even without you uttering a word. It is not hard to say that robots will replace humans very shortly.

The evolution of AI has increased the demand for AI experts. With diversified AI job roles and emerging career opportunities, it won’t be difficult to find a suitable job matching your interests and goals. Here are our top 5 AI job picks, along with the skills that will help you land them.


Must-have skills for AI jobs

To land an AI job, you need to train yourself and become an expert in multiple skills. These skills can only be mastered through effort, hard work, and enthusiasm to learn. Every job requires its own set of skills: some may require data analysis, while others might demand expertise in machine learning. But even with the diverse job roles, the core skills needed for AI jobs remain constant:

  1. Expertise in a programming language (especially in Python, Scala, and Java)
  2. Hands-on knowledge of Linear Algebra and Statistics
  3. Proficient at Signal Processing Techniques
  4. Profound knowledge of the Neural Network Architects


Read blog about AI and Machine learning trends for 2023


Our top 5 picks for AI jobs


1. Machine Learning Engineer


Who are they?

They are responsible for discovering and designing self-driven AI systems that can run smoothly without human intervention. Their main task is to automate predictive models.

What do they do?

They design ML systems, draft ML algorithms, and select appropriate data sets, then analyze large volumes of data and test and verify ML algorithms.

Qualifications required? Individuals with bachelor’s or doctoral degrees in computer science or mathematics, along with proficiency in a modern programming language, will most likely get this job. Knowledge of cloud applications, expertise in mathematics, computer science, machine learning, and programming languages, and related certifications are preferred.


2. Robotics Scientist


Who are they? They design and develop robots that can perform day-to-day tasks efficiently and without errors. Their services are used in space exploration, healthcare, human identification, etc.

What do they do? They design and develop problem-solving robots that can be operated with voice commands. They operate different software and understand the methodology behind it to construct mechanical prototypes. They collaborate with specialists from other fields to control programming software and use it accordingly.

Qualifications required? A robotics scientist must have a bachelor’s degree in robotics, mechanical engineering, electrical engineering, or electromechanical engineering. Individuals with expertise in mathematics, AI certifications, and knowledge of CADD will be preferred.


3. Data Scientist


Who are they? They evaluate and analyze data and extract valuable insights that assist organizations in making better decisions.

What do they do? They gather, organize, and interpret large amounts of data using ML and predictive analytics, turning it into far more valuable insights. They use tools and data platforms like Hadoop, Spark, and Hive, and programming languages, especially Java, SQL, and Python, to go beyond statistical analysis.

Qualifications required? They must have a master’s or doctoral degree in computer science with hands-on knowledge of programming languages, data platforms, and cloud tools.

Master these data science tools to grow your career as Data Scientist


4. Research Scientist


Who are they? They analyze data and evaluate gathered information using restrained-based examinations.

What do they do? Research scientists have expertise in a range of AI skills, including ML, NLP, data processing and representation, and AI models, which they use to solve problems and find modern solutions.

Qualifications required? A bachelor’s or doctoral degree in computer science or another related technical field. Along with good communication skills, knowledge of AI, parallel computing, and AI algorithms and models is highly recommended for those who are thinking of pursuing this career.


5. Business Intelligence Developer


Who are they? They organize and generate the business interface and are responsible for maintaining it.

What do they do? They organize business data, extract insights from it, keep a close eye on market trends, and assist organizations in achieving profitable results. They are also responsible for maintaining complex data in cloud-based platforms.

Qualifications required? A bachelor’s degree in computer science or another related technical field, with added AI certifications. Individuals with experience in data mining, SSRS, SSIS, and BI technologies, and with certifications in data science, will be preferred.



A piece of advice for those who want to pursue AI as a career: “invest your time and money.” Take related short courses, acquire ML and AI certifications, and learn what data science and BI technologies and practices are all about. With all of this, you can build a growth-oriented career as an AI expert in no time.


Guest blog
| November 22, 2022

With the surge in demand for and interest in AI and machine learning, many new trends are emerging in this space. If you are a tech professional, this blog will give you an exciting look at what’s next in the realm of Artificial Intelligence and Machine Learning.


Emerging AI and machine learning trends

Data security and regulations 

In today’s economy, data is the main commodity. To rephrase, intellectual capital is the most precious asset that businesses must safeguard. The quantity of data they manage, as well as the hazards connected with it, is only going to expand after the emergence of AI and ML. Large volumes of private information are backed up and archived by many companies nowadays, which poses a growing privacy danger. Don Evans, CEO of Crewe Foundation   


The future currency is data. In other words, it’s the most priceless resource that businesses must safeguard. The amount of data they handle, and the hazards attached to it will only grow when AI and ML are brought into the mix. Today’s businesses, for instance, back up and store enormous volumes of sensitive customer data, which is expected to increase privacy risks by 2023.

Overlap of AI and IoT 

The boundaries between AI and the Internet of Things are blurring. While each technology has merits of its own, only when they are combined can they offer novel possibilities. Smart voice assistants like Alexa and Siri only exist because AI and the Internet of Things have come together. Why, therefore, do these two technologies complement one another so well?

The Internet of Things (IoT) is the digital nervous system, while Artificial Intelligence (AI) is the decision-making brain. AI’s speed at analyzing large amounts of data for patterns and trends improves the intelligence of IoT devices. As of now, just 10% of commercial IoT initiatives make use of AI, but that number is expected to climb to 80% by 2023. Josh Thill, Founder of Thrive Engine 

AI ethics: Understanding biased AI and associated ethical dilemmas 

Why then do these two technologies complement one another so well? AI and IoT can be compared to the brain and nervous system of the digital world, respectively. IoT systems have become more sophisticated thanks to AI’s capacity to quickly extract insights from data. This development in AI and machine learning gives software developers and embedded engineers another reason to add AI/ML skills to their resumes.


Augmented Intelligence   

The growth of augmented intelligence should be a relieving trend for individuals who may still be concerned about AI stealing their jobs. It combines the greatest traits of both people and technology, offering businesses the ability to raise the productivity and effectiveness of their staff.

40% of infrastructure and operations teams in big businesses will employ AI-enhanced automation by 2023, increasing efficiency. Naturally, for best results, their staff should be knowledgeable in data science and analytics or have access to training in the newest AI and ML technologies. 

We are moving on from the concept of Artificial Intelligence to Augmented Intelligence, where decision models blend artificial and human intelligence, and where AI finds, summarizes, and collates information from across the information landscape, for example, a company’s internal data sources. This information is presented to the human operator, who can make a human decision based on that information. This trend is supported by recent breakthroughs in Natural Language Processing (NLP) and Natural Language Understanding (NLU). Kuba Misiorny, CTO of Untrite Ltd


Despite being increasingly commonplace, there are trust problems with AI. Businesses will want to utilize AI systems more frequently, and they will want to do so with greater assurance. Nobody wants to put their trust in a system they don’t fully comprehend.

As a result, 2023 will see a stronger push to deploy AI in a transparent and well-defined manner. Businesses will work to grasp how AI models and algorithms function, and AI/ML software providers will need to make complex ML solutions easier for consumers to understand.

The importance of experts who work in the trenches of programming and algorithm development will increase as transparency becomes a hot topic in the AI world. 

Composite AI 

Composite AI is a new approach that generates deeper insights from any content and data by fusing different AI technologies. Knowledge graphs are much more symbolic, explicitly modeling domain knowledge and, when combined with the statistical approach of ML, create a compelling proposition. Composite AI expands the quality and scope of AI applications and, as a result, is more accurate, faster, transparent, and understandable, and delivers better results to the user. Dorian Selz, CEO of Squirro

It’s a major advance in the evolution of AI, and marrying content with context and intent allows organizations to get enormous value from the ever-increasing volume of enterprise data. Composite AI will be a major trend for 2023 and beyond.

Continuous focus on healthcare

There has been concern that AI will eventually replace humans in the workforce ever since the concept was first proposed in the 1950s. In 2018, a deep learning algorithm was constructed that demonstrated accurate diagnosis using a dataset consisting of more than 50,000 normal chest images and 7,000 scans that revealed active tuberculosis. Since then, I believe that the healthcare business has mostly made use of Machine Learning (ML) and Deep Learning applications of artificial intelligence. Marie Ysais, Founder of Ysais Digital Marketing

Learn more about the role of AI in healthcare:

AI in healthcare has improved patient care


Pathology-assisted diagnosis, intelligent imaging, medical robotics, and the analysis of patient information are just a few of the many applications of artificial intelligence in the healthcare industry. Leading stakeholders in the healthcare industry have been presented with advancements and machine-learning models from some of the world’s largest technology companies. Next year, 2023, will be an important year to observe developments in the field of artificial intelligence.

Algorithmic decision-making 

Advanced algorithms are taking on the skills of human doctors, and while AI may increase productivity in the medical world, nothing can take the place of actual doctors. Even in robotic surgery, the whole procedure is physician-guided. AI is a good supplement to physician-led health care. The future of medicine will be high-tech with a human touch.  


No-code tools   

The low-code/no-code ML revolution is accelerating, creating a new breed of Citizen AI. These tools fuel mainstream ML adoption in businesses that were previously left out of the first ML wave (mostly taken advantage of by BigTech and other large institutions with even larger resources). Maya Mikhailov, Founder of Savvi AI

Low-code intelligent automation platforms allow business users to build sophisticated solutions that automate tasks, orchestrate workflows, and automate decisions. They offer easy-to-use, intuitive drag-and-drop interfaces, all without the need to write a line of code. As a result, low-code intelligent automation platforms are popular with tech-savvy business users, who no longer need to rely on professional programmers to design their business solutions. 


Cognitive analytics 

Cognitive analytics is another emerging trend that will continue to grow in popularity over the next few years. The ability of computers to analyze data in a way that humans can understand has been around for a while, but it is only recently becoming available in applications such as Google Analytics or Siri, and it will only get better from here!


Virtual assistants 

Virtual assistants are another area where NLP is being used to enable more natural human-computer interaction. Virtual assistants like Amazon Alexa and Google Assistant are becoming increasingly common in homes and businesses. In 2023, we can expect to see them become even more widespread as they evolve and improve. Idrees Shafiq, Marketing Research Analyst at Astrill


Virtual assistants are becoming increasingly popular, thanks to their convenience and ability to provide personalized assistance. In 2023, we can expect to see even more people using virtual assistants, as they become more sophisticated and can handle a wider range of tasks. Additionally, we can expect to see businesses increasingly using virtual assistants for customer service, sales, and marketing tasks.

Information security (InfoSec)

The methods and tools used by companies to safeguard information fall under the category of information security. It comprises policy settings designed to prevent unauthorized access to, use of, disclosure of, disruption of, modification of, inspection of, recording of, or destruction of data.

With AI models covering a broad range of areas, from network and security architecture to testing and auditing, information security is expected to remain a developing and expanding field. To safeguard sensitive data from potential cyberattacks, information security procedures are built on the three fundamental goals of confidentiality, integrity, and availability, known as the CIA triad. Daniel Foley, Founder of Daniel Foley SEO


Wearable devices 

The wearable market continues to grow. Wearable devices, such as fitness trackers and smartwatches, are becoming more popular as they become more affordable and functional. These devices collect data that can be used by AI applications to provide insights into user behavior. Oberon, Founder and CEO of Very Informed


Process discovery

Process discovery can be characterized as a combination of tools and methods that rely heavily on artificial intelligence (AI) and machine learning to assess the performance of people participating in a business process. Compared with earlier versions of process mining, it goes further in figuring out what occurs when individuals interact in different ways with various objects to produce business process events.

The methodologies and AI models vary widely, from clicks of the mouse for specific reasons to opening files, papers, web pages, and so forth. All of this necessitates various information transformation techniques. The automated procedure using AI models is intended to increase the effectiveness of commercial procedures. Salim Benadel, Director at Storm Internet


Robotic Process Automation (RPA)

An emerging tech trend that will start becoming more popular is Robotic Process Automation, or RPA. It is similar to AI and machine learning, and it is used for specific types of job automation. Right now, it is primarily used for things like data handling, dealing with transactions, processing and interpreting job applications, and automated email responses. It makes many business processes much faster and more efficient, and as time goes on, more processes will be taken over by RPA. Maria Britton, CEO of Trade Show Labs

Robotic process automation is an application of artificial intelligence that configures a robot (software application) to interpret, communicate, and analyze data. This form of artificial intelligence helps to partially or fully automate manual operations that are repetitive and rule-based. Percy Grunwald, Co-Founder of Hosting Data
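To make the idea of "repetitive and rule-based" automation concrete, here is a minimal Python sketch of a rule table that routes incoming emails by subject line. The rules, keywords, and action names are hypothetical and chosen only for illustration; real RPA products offer far richer tooling than this.

```python
# Hypothetical rule table: (keyword in subject, action to take).
# Rules are checked in order; the first match wins.
RULES = [
    ("invoice", "forward_to_accounting"),
    ("job application", "forward_to_hr"),
    ("unsubscribe", "remove_from_mailing_list"),
]


def route(subject):
    """Return the action for an email subject, or escalate if no rule matches."""
    lowered = subject.lower()
    for keyword, action in RULES:
        if keyword in lowered:
            return action
    # Anything the rules do not cover still goes to a person.
    return "escalate_to_human"


if __name__ == "__main__":
    for subject in ["Invoice #2041 attached", "Job application: data analyst", "Lunch?"]:
        print(subject, "->", route(subject))
```

The fallback to a human reflects the partial automation described above: the software handles the predictable cases and hands everything else back to staff.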


Generative AI 

Most people say AI is good for automating routine, repetitive work. However, AI technologies and applications are now also being developed to replicate creativity, one of the most distinctive human skills. Generative AI algorithms leverage existing data (video, photos, sounds, or computer code) to create entirely new material.

Deepfake videos and the Metaphysic act on America’s Got Talent have popularized the technology. In 2023, organizations will increasingly employ it to generate synthetic data. Synthetic audio and video data can eliminate the need to record footage and speech. Simply write what you want the audience to see and hear, and the AI creates it. Leonidas Sfyris

With the rise of personalization in video games, new content has become increasingly important. Companies cannot hire enough artists to constantly create new themes for all their different characters, so the ability to enter a concept such as “cowboy” and have the art assets generated for every character becomes a powerful tool.


Observability in practice

By delving deeply into contemporary networked systems, Applied Observability facilitates the discovery and resolution of issues more quickly and automatically. Applied observability is a method for keeping tabs on the health of a sophisticated structure by collecting and analyzing data in real time to identify and fix problems as soon as they arise.

Use observability for application monitoring and debugging. Observability tooling collects telemetry data, including logs, metrics, traces, and dependencies. The data is then correlated in real time to give responders full context for the incidents they are called to. Automation, machine learning, and artificial intelligence (AIOps) might be used to eliminate the need for human intervention in problem-solving. Jason Wise, Chief Editor at Earthweb
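As a rough illustration of what correlating telemetry can mean, the Python sketch below groups logs, metrics, and traces that share an identifier so a responder sees them side by side. The `trace_id` field and the sample events are assumptions made for this example, not the format of any particular observability product.

```python
from collections import defaultdict


def correlate(logs, metrics, traces):
    """Group telemetry events by their shared trace_id (a hypothetical field)."""
    incidents = defaultdict(lambda: {"logs": [], "metrics": [], "traces": []})
    for kind, events in (("logs", logs), ("metrics", metrics), ("traces", traces)):
        for event in events:
            incidents[event["trace_id"]][kind].append(event)
    return dict(incidents)


# Hypothetical sample telemetry for two requests.
logs = [
    {"trace_id": "a1", "message": "timeout calling payment service"},
    {"trace_id": "b2", "message": "request completed"},
]
metrics = [{"trace_id": "a1", "name": "latency_ms", "value": 5021}]
traces = [{"trace_id": "a1", "span": "checkout -> payment"}]

context = correlate(logs, metrics, traces)

if __name__ == "__main__":
    # Everything known about request "a1" in one place.
    print(context["a1"])
```

Production systems do this continuously over streaming data; the sketch only shows the grouping step on a small in-memory sample.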


Natural Language Processing 

As more and more business processes are conducted through digital channels, including social media, e-commerce, customer service, and chatbots, NLP will become increasingly important for understanding user intent and producing the appropriate response.

Read more about NLP tasks and techniques in this blog:

Natural Language Processing – Tasks and techniques


In 2023, we can expect to see increased use of Natural Language Processing (NLP) for communication and data analysis. NLP has already seen widespread adoption in customer service chatbots, but it may also be utilized for data analysis, such as extracting information from unstructured texts or analyzing sentiment in large sets of customer reviews. Additionally, deep learning algorithms have already shown great promise in areas such as image recognition and autonomous vehicles.

In the coming years, we can expect to see these algorithms applied to various industries such as healthcare for medical imaging analysis and finance for stock market prediction. Lastly, the integration of AI tools into various industries will continue to bring about both exciting opportunities and ethical considerations. Nicole Pav, AI Expert.  
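The sentiment analysis of customer reviews mentioned above can be sketched in a few lines of Python. This is a deliberately simple word-list approach with a hypothetical vocabulary, not the deep learning methods the quote refers to; it only shows the basic idea of turning unstructured text into a label.

```python
import re

# Hypothetical word lists; a real system would learn these from data.
POSITIVE = {"great", "love", "excellent", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "broken", "terrible", "refund"}


def sentiment_score(review):
    """Count positive words minus negative words in a review."""
    words = re.findall(r"[a-z']+", review.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def label(review):
    """Map the score to a coarse sentiment label."""
    score = sentiment_score(review)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


reviews = [
    "Great product, fast delivery and helpful support!",
    "Arrived broken and the replacement was slow.",
    "It is a phone case.",
]
labels = [label(r) for r in reviews]

if __name__ == "__main__":
    for review, sentiment in zip(reviews, labels):
        print(sentiment, "|", review)
```

A word list like this misses negation and sarcasm, which is one reason trained models are usually preferred for large sets of reviews.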


Do you know any other AI and Machine Learning trends?

Share with us in the comments if you know about any other trending or upcoming developments in AI and machine learning.


Top 10 trending podcasts of AI (Artificial Intelligence) and ML (Machine Learning)
Ayesha Saleem
| November 14, 2022

What could be a better way to spend your days than listening to interesting bits about trending AI and machine learning topics? Here’s a list of the 10 best AI and ML podcasts.

Top 10 Trending AI (Artificial Intelligence) and ML (Machine Learning) podcasts 


1. The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

Artificial intelligence and machine learning are fundamentally altering how organizations run and how individuals live. It is important to discuss the latest innovations in these fields to gain the most benefit from technology. The TWIML AI Podcast reaches a large and influential audience of ML/AI academics, data scientists, engineers, and tech-savvy business and IT (Information Technology) leaders, bringing together the best minds and ideas from the field of ML and AI.

The podcast is hosted by a renowned industry analyst, speaker, commentator, and thought leader Sam Charrington. Artificial intelligence, deep learning, natural language processing, neural networks, analytics, computer science, data science, and other technologies are discussed. 


2. The AI Podcast

One individual, one interview, one account. This podcast examines the effects of AI on our world. The AI Podcast is building a real-time oral history of AI; it has amassed 3.4 million listens and has been hailed as one of the best AI and machine learning podcasts. It brings you a new story and a new 25-minute interview every two weeks. So whatever challenges you are facing in marketing, mathematics, astrophysics, or paleo history, or if you are simply trying to discover an automated way to sort out your kid’s growing Lego pile, listen in and get inspired.


3. Data Skeptic

Data Skeptic launched as a podcast in 2014. Hundreds of interviews and tens of millions of downloads later, it is a widely recognized authoritative source on data science, artificial intelligence, machine learning, and related topics.

Data Skeptic runs in seasons. The show explores each season’s subject by speaking with active researchers and business leaders who are involved in it.

Guests are carefully chosen using an internal system. Since the team does not work with PR firms, it is unable to reply to the daily stream of unsolicited submissions. Publishing quality research on arXiv is the best way to get on the show: the site is crawled, and the team will find you.

Data Skeptic is a boutique consulting company in addition to its podcast. Kyle participates directly in each project the team undertakes. The work primarily focuses on end-to-end machine learning, cloud infrastructure, and algorithmic design.

The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches. 


Pro-tip: Enroll in the data science boot camp today to learn the basics of the industry





Artificial Intelligence and Machine Learning podcast

4. Podcast.ai 

Podcast.ai is entirely generated by artificial intelligence. Every week, they explore a new topic in-depth, and listeners can suggest topics or even guests and hosts for future episodes. Whether you are a machine learning enthusiast, just want to hear your favorite topics covered in a new way or even just want to listen to voices from the past brought back to life, this is the podcast for you.

The podcast aims to put incremental advances into a broader context and consider the global implications of developing technology. AI is about to change your world, so pay attention. 


5. The Talking Machines

Talking Machines is a podcast hosted by Katherine Gorman and Neil Lawrence. The objective of this show is to bring you clear conversations with experts in the field of machine learning, insightful discussions of industry news, and useful answers to your questions. Machine learning is changing the questions we can ask of the world around us; here we explore how to ask the best questions and what to do with the answers.


6. Linear Digressions

Linear Digressions is for anyone interested in unusual applications of machine learning and data science. In each episode, hosts Ben Jaffe and Katie Malone explore machine learning and data science through interesting applications, covering some of the most exciting developments in the industry, such as AI-driven medical assistants, open policing data, causal trees, the grammar of graphics, and a lot more.


7. Practical AI: Machine Learning, Data Science

Making artificial intelligence practical, productive, and accessible to everyone. Practical AI is a show in which technology professionals, businesspeople, students, enthusiasts, and expert guests engage in lively discussions about Artificial Intelligence and related topics (Machine Learning, Deep Learning, Neural Networks, GANs (generative adversarial networks), MLOps (machine learning operations), AIOps, and more).

The focus is on productive implementations and real-world scenarios that are accessible to everyone. If you want to keep up with the latest advances in AI, while keeping one foot in the real world, then this is the show for you! 


8. Data Stories

Enrico Bertini and Moritz Stefaner discuss the latest developments in data analytics, visualization, and related topics. The Data Stories podcast releases regular episodes on a range of topics related to data visualization. It highlights the importance of data stories in different fields, including statistics, finance, medicine, computer science, and many more. Enrico and Moritz invite industry leaders, experienced professionals, and data visualization instructors to share their stories and explain the importance of turning data into appealing charts and graphs.


9. The Artificial Intelligence Podcast

The Artificial Intelligence Podcast is hosted by Dr. Tony Hoang. This podcast talks about the latest innovations in the artificial intelligence and machine learning industry. Recent episodes discuss text-to-image generators, robot dogs, soft robotics, voice bot options, and a lot more.


10. Learning Machines 101

Smart machines employing artificial intelligence and machine learning are prevalent in everyday life. The objective of this podcast series is to inform students and instructors about the advanced technologies introduced by AI and the following: 

  •  How do these devices work? 
  • Where do they come from? 
  • How can we make them even smarter? 
  • And how can we make them even more human-like? 


Have we missed any of your favorite podcasts?

Do not forget to share the names of your favorite AI and ML podcasts in the comments. Read this amazing blog if you want to know about Data Science podcasts.

Data Science vs AI – What does 2023 demand?
Lafond Wanda
| November 10, 2022

Most people have heard the terms “data science” and “AI” at least once in their lives. Indeed, both of these are extremely important in the modern world as they are technologies that help us run quite a few of our industries. 

But even though data science and Artificial Intelligence are somewhat related to one another, they are still very different. There are things they have in common which is why they are often used together, but it is crucial to understand their differences as well. 

What is Data Science? 

As the name suggests, data science is a field that involves studying and processing data in big quantities using a variety of technologies and techniques to detect patterns, make conclusions about the data, and help in the decision-making process. Essentially, it is an intersection of statistics and computer science largely used in business and different industries. 

Artificial Intelligence vs Data science vs Machine learning

The standard data science lifecycle includes capturing data and then maintaining, processing, and analyzing it before finally communicating conclusions about it through reporting. This makes data science extremely important for analysis, prediction, decision-making, problem-solving, and many other purposes. 
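The lifecycle stages named above (capture, maintain, process, analyze, communicate) can be sketched as a chain of small Python functions. The sales records and function names below are hypothetical and exist only to show how each stage hands its output to the next.

```python
from statistics import mean


def capture():
    """Capture: collect raw records (hypothetical data; None marks a missing value)."""
    return [
        {"region": "north", "sales": 120},
        {"region": "south", "sales": None},
        {"region": "north", "sales": 80},
        {"region": "south", "sales": 100},
    ]


def maintain(records):
    """Maintain: clean the data by dropping records with missing sales."""
    return [r for r in records if r["sales"] is not None]


def process(records):
    """Process: group sales figures by region."""
    grouped = {}
    for record in records:
        grouped.setdefault(record["region"], []).append(record["sales"])
    return grouped


def analyze(grouped):
    """Analyze: compute the average sales per region."""
    return {region: mean(values) for region, values in grouped.items()}


def communicate(averages):
    """Communicate: turn the results into report lines."""
    return [f"{region}: average sales {avg:.1f}" for region, avg in sorted(averages.items())]


report = communicate(analyze(process(maintain(capture()))))

if __name__ == "__main__":
    print("\n".join(report))
```

Real projects replace each function with far heavier machinery, but the order of the stages stays the same.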

What is Artificial Intelligence? 

Artificial Intelligence is the field that involves the simulation of human intelligence and the processes within it by machines and computer systems. Today, it is used in a wide variety of industries and allows our society to function as it currently does by using different AI-based technologies. 

Some of the most common examples in action include machine learning, speech recognition, and search engine algorithms. While AI technologies are rapidly developing, there is still a lot of room for their growth and improvement. For instance, there is no powerful enough content generation tool that can write texts that are as good as those written by humans. Therefore, it is always preferred to hire an experienced writer to maintain the quality of work.  

What is Machine Learning? 

As mentioned above, machine learning is a type of AI-based technology that uses data to “learn” and improve specific tasks that a machine or system is programmed to perform. Though machine learning is seen as a part of the greater field of AI, its use of data puts it firmly at the intersection of data science and AI. 
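To show what "learning" from data can look like in its simplest form, the Python sketch below fits a straight line to a handful of points using ordinary least squares and then uses it to predict an unseen value. The data set (hours of use versus battery drained) is invented for illustration, and a line fit is only one of many learning techniques.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data: hours of use vs. battery drained (%).
hours = [1.0, 2.0, 3.0, 4.0]
drained = [12.0, 22.0, 32.0, 42.0]

slope, intercept = fit_line(hours, drained)


def predict(x):
    """Use the learned parameters to estimate an unseen value."""
    return slope * x + intercept


if __name__ == "__main__":
    print(f"learned: drained = {slope:.1f} * hours + {intercept:.1f}")
    print(f"predicted drain after 5 hours: {predict(5.0):.1f}%")
```

The program is never told the relationship between hours and battery drain; it estimates the slope and intercept from the examples, which is the sense in which the system "learns" from data.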

Similarities between Data Science and AI 

By far the most important point of connection between data science and Artificial Intelligence is data. Without data, neither of the two fields would exist and the technologies within them would not be used so widely in all kinds of industries. In many cases, data scientists and AI specialists work together to create new technologies or improve old ones and find better ways to handle data. 

As explained earlier, there is a lot of room for improvement when it comes to AI technologies. The same can be somewhat said about data science. That’s one of the reasons businesses still hire professionals to accomplish certain tasks like custom writing requirements, design requirements, and other administrative work.  

Differences between Data Science and AI 

There are quite a few differences between both. These include: 

  • Purpose – Data science aims to analyze data to make conclusions, predictions, and decisions. Artificial Intelligence aims to enable computers and programs to perform complex processes in a similar way to how humans do. 
  • Scope – Data science includes a variety of data-related operations such as data mining, cleansing, reporting, etc. AI primarily focuses on machine learning, but there are other technologies involved too, such as robotics, neural networks, etc. 
  • Application – Both are used in almost every aspect of our lives, but while data science is predominantly present in business, marketing, and advertising, AI is used in automation, transport, manufacturing, and healthcare. 

Examples of Data Science and Artificial Intelligence in use 

To give you an even better idea of what data science and Artificial Intelligence are used for, here are some of the most interesting examples of their application in practice: 

  • Analytics – Analyze customers to better understand the target audience and offer the kind of product or service that the audience is looking for. 
  • Monitoring – Monitor the social media activity of specific types of users and analyze their behavior. 
  • Prediction – Analyze the market and predict demand for specific products or services in the near future. 
  • Recommendation – Recommend products and services to customers based on their customer profiles, buying behavior, etc. 
  • Forecasting – Predict the weather based on a variety of factors and then use these predictions for better decision-making in the agricultural sector. 
  • Communication – Provide high-quality customer service and support with the help of chatbots. 
  • Automation – Automate processes in all kinds of industries from retail and manufacturing to email marketing and pop-up on-site optimization. 
  • Diagnosing – Identify and predict diseases, give correct diagnoses, and personalize healthcare recommendations. 
  • Transportation – Use self-driving cars to get where you need to go. Use self-navigating maps to travel. 
  • Assistance – Get assistance from smart voice assistants that can schedule appointments, search for information online, make calls, play music, and more. 
  • Filtering – Identify spam emails and automatically get them filtered into the spam folder. 
  • Cleaning – Get your home cleaned by a smart vacuum cleaner that moves around on its own and cleans the floor for you. 
  • Editing – Check texts for plagiarism and proofread and edit them by detecting grammatical, spelling, punctuation, and other linguistic mistakes. 

It is not always easy to tell which of these examples is about data science and which one is about Artificial Intelligence because many of these applications use both of them. This way, it becomes even clearer just how much overlap there is between these two fields and the technologies that come from them. 

What is your choice?

At the end of the day, data science and AI remain some of the most important technologies in our society and will likely help us invent more things and progress further. As a regular citizen, understanding the similarities and differences between the two will help you better understand how data science and Artificial Intelligence are used in almost all spheres of our lives. 

Saving lives behind the wheel: Artificial Intelligence and Computer Vision for road safety 
Data Science Dojo Staff
| October 31, 2022

In this blog, we will discuss how Artificial Intelligence and computer vision are contributing to improving road safety for people. 

Each year, about 1.35 million people are killed in crashes on the world’s roads, and as many as 50 million others are seriously injured, according to the World Health Organization. With the increase in population and access to motor vehicles over the years, rising traffic and its harsh effects on the streets can be vividly observed with the growing number of fatalities.

We call these crashes “accidents”, but in reality, they can be prevented. Governments all over the world are resolving to reduce them with the help of artificial intelligence and computer vision.  


saving lives behind wheels - AI and road safety
Artificial intelligence and computer vision for road safety

Humans make mistakes, as it is in their nature to do so, but when small mistakes can lead to huge losses in the form of traffic accidents, the design of the system itself has to change.

A closer look at this problem shows that, for lack of technological innovation, this trend has not come down over the past 20 years. However, with the adoption of the ‘Vision Zero’ program by governments worldwide, we may finally see a shift in this unfortunate trend.  

 Role of Artificial Intelligence for improving road traffic

AI can improve road traffic by reducing human error, speeding up the detection of and response to accidents, and improving safety. With the advancement of computer vision, the quality of data and predictions made with video analytics has increased tenfold.  


Artificial Intelligence is already leveraging the power of vision analytics in scenarios like identifying mobile phone usage by drivers on highways and recognizing human errors much faster. But what lies ahead for use in our everyday lives? Will progress be fast enough to tackle the complexities self-driving cars bring with them? 


In recent studies, it’s been inferred through data that subtle distractions on a busy road are correlated to the traffic accidents there. Experts believe that in order to minimize the risk of an accident, the system must be planned with the help of architects, engineers, transport authorities, city planners and AI.  

With the help of AI, it becomes easier to identify the problems at hand; however, AI will not solve them on its own. Designing the streets in a way that eliminates certain causes of accidents could be the essential step in overcoming the situation.  

AI also has the potential to increase efficiency during peak hours by optimizing traffic flow. Road traffic management has undergone a fundamental shift because of the quick development of artificial intelligence (AI). With increasing accuracy, AI is now able to predict and manage the movement of people, vehicles, and goods at various locations along the transportation network.  

As the field advances, simple AI programs, along with machine learning and data science, are enabling better service for citizens than ever before, while also reducing accidents by streamlining traffic at intersections and enhancing safety when roads are closed due to construction or other events.  

Deep learning impact on improved infrastructure for road safety

Deep learning systems’ capacity for processing and analyzing enormous amounts of data and making quick decisions from it has also facilitated the development of efficient mass transit systems like ride-sharing services. With the advent of cloud-edge devices, the process of gathering and analyzing data has become much more efficient.

A growing number of data sources has increased not only the quality but also the quantity and variety of data. These systems leverage data from real-time edge devices and can process it effectively by retrofitting existing camera infrastructure for road safety. 

 Join our upcoming webinar

In our upcoming webinar on 29th November, we will summarize the challenges in the industry and how AI plays its part in creating a safer environment through solutions that help avoid human errors.  








AI powered document search
Tyler Hutcherson
| October 7, 2022

Applications leveraging AI powered search are on the rise. My colleague, Sam Partee, recently introduced vector similarity search (VSS) in Redis and how it can be applied to common use cases. As he puts it:


“Users have come to expect that nearly every application and website provide some type of search functionality. With effective search becoming ever-increasingly relevant (pun intended), finding new methods and architectures to improve search results is critical for architects and developers.”

–  Sam Partee: Vector Similarity Search: from Basics to Production


For example, in eCommerce, allowing shoppers to browse product inventory with a visual similarity component brings online shopping one step closer to mirroring an in-person experience. 


However, this is only the tip of the iceberg. Here, we will pick up right where Sam left off with another common use case for vector similarity: Document Search.


We will cover:

  • Common applications of AI-powered document search
  • A typical production workflow
  • A hosted example using the arXiv papers dataset
  • Scaling embedding workflows


Lastly, we will share details about an exciting upcoming hackathon co-hosted by Redis, MLOps Community, and Saturn Cloud from October 24 – November 4 that you can join in the coming weeks!


AI hackathon co-hosted by Redis

The use case

Whether we realize it or not, we take advantage of document search and processing capabilities in everyday life. We see its impact while searching for a long-lost text message in our phone, automatically filtering spam from our email inbox, and performing basic Google searches.

Businesses use it for information retrieval (e.g. insurance claims, legal documents, financial records), and even generating content-based recommendations (e.g. articles, tweets, posts). 

Beyond lexical search

Traditional search, i.e. lexical search, emphasizes the intersection of common keywords between docs. However, a search query and document may be very similar to one another in meaning and not share any of the same keywords (or vice versa). For example, in the sentences below, all readers should be able to parse that they are communicating the same thing. But – only two words overlap.


“The weather looks dark and stormy outside.” <> “The sky is threatening thunder and lightning.”


Another example…with pure lexical search, “USA” and “United States” would not trigger a match though these are interchangeable terms.

This is where lexical search breaks down on its own. 
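To make the lexical gap concrete, here is a minimal sketch that measures keyword overlap between texts with a Jaccard score. The helper names `tokenize` and `jaccard` are illustrative assumptions, not part of any search library, and real lexical engines use more elaborate scoring.

```python
import re


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of distinct word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a: str, b: str) -> float:
    """Fraction of distinct words that the two texts share."""
    tokens_a, tokens_b = tokenize(a), tokenize(b)
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


s1 = "The weather looks dark and stormy outside."
s2 = "The sky is threatening thunder and lightning."

# Only the filler words "the" and "and" are shared.
shared = tokenize(s1) & tokenize(s2)
print(sorted(shared))
print(round(jaccard(s1, s2), 3))

# Interchangeable terms with no keyword in common score zero.
print(jaccard("USA", "United States"))
```

Both pairs mean nearly the same thing to a reader, yet a purely keyword-based score treats them as barely related or unrelated.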

Neural search

Search has evolved from simply finding documents to providing answers. Advances in NLP and large language models (GPT-3, BERT, etc) have made it incredibly easy to overcome this lexical gap AND expose semantic properties of text. Sentence embeddings form a condensed vector-like representation of unstructured data that encodes “meaning”.


Neural search - Sentence embeddings
Sentence embeddings – Data Science Dojo


These embeddings allow us to compute similarity metrics (e.g. cosine similarity, euclidean distance, and inner product) to find similar documents, i.e. neural (or vector) search.  Neural search respects word order and understands the broader context beyond the explicit terms used.
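As a rough illustration of the three metrics named above, the sketch below computes them on tiny hand-made vectors. The vectors are made-up toy values, not real sentence embeddings, which typically have hundreds of dimensions.

```python
import math


def inner_product(u, v):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def euclidean_distance(u, v):
    """Straight-line distance between two vectors (smaller means closer)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_similarity(u, v):
    """Inner product normalized by vector lengths (1.0 means same direction)."""
    norm_u = math.sqrt(inner_product(u, u))
    norm_v = math.sqrt(inner_product(v, v))
    return inner_product(u, v) / (norm_u * norm_v)


# Toy vectors for illustration only.
query = [1.0, 2.0, 2.0]
doc_a = [2.0, 4.0, 4.0]   # same direction as the query, twice as long
doc_b = [2.0, -1.0, 0.0]  # orthogonal to the query

for name, doc in (("doc_a", doc_a), ("doc_b", doc_b)):
    print(
        name,
        cosine_similarity(query, doc),
        euclidean_distance(query, doc),
        inner_product(query, doc),
    )
```

Note that cosine similarity ignores vector length, so `doc_a` is a perfect match by that metric even though its Euclidean distance from the query is not zero. Which metric fits best depends on how the embedding model was trained.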


Immediately, this opens up a host of powerful use cases:

  • Question & Answering Services
  • Intelligent Document Search + Retrieval
  • Insurance Claim Fraud Detection


Hugging face transformer


What’s even better is that ready-made models from Hugging Face Transformers can fast-track text-to-embedding transformations. Though, it’s worth noting that many use cases require fine-tuning to ensure quality results.


Production workflow

In a production software environment, document search must take advantage of a low-latency database that persists all docs and manages a search index that can enable nearest neighbors vector similarity operations between documents.

RediSearch was introduced as a module to extend this functionality over a Redis cluster that is likely already handling web request caching or online ML feature serving (for low-latency model inference).


Below we will highlight the core components of a typical production workflow.

AI powered - Typical production flow
Document processing production workflow


Document processing

In this phase, documents must be gathered, embedded, and stored in the vector database. This process happens up front before any client tries to search and will also consistently run in the background on document updates, deletions, and insertions.

Up front, this might be iteratively done in batches from some data warehouse. Also, it’s common to leverage streaming data structures (e.g., Kafka, Kinesis, or Redis Streams) to orchestrate the pipeline in real time.

Scalable document processing services might take advantage of a high-throughput inference server like NVIDIA’s Triton. Triton enables teams to deploy, run, and scale trained AI models from any standard backend on GPU (or CPU) hardware.

Depending on the source, volume, and variety of data, a number of pre-processing steps will also need to be included in the pipeline (including embedding models to create vectors from text).


After a client enters a query along with some optional filters (e.g. year, category), the query text is converted into an embedding projected into the same vector space as the pre-processed documents. This allows for discovery of the most relevant documents from the entire corpus.
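The query step can be sketched as a brute-force search: apply the optional filters, then rank the remaining documents by similarity to the query embedding. This is only an illustration under stated assumptions. The `search` function, the document fields, and the toy vectors are all hypothetical, the query is assumed to be embedded already, and a production system would delegate this work to a vector index rather than scan every document.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def search(query_vector, documents, k=2, **filters):
    """Return ids of the k most similar documents that match every filter."""
    candidates = [
        doc for doc in documents
        if all(doc.get(field) == value for field, value in filters.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda doc: cosine_similarity(query_vector, doc["vector"]),
        reverse=True,
    )
    return [doc["id"] for doc in ranked[:k]]


# Hypothetical pre-processed documents with made-up 3-d embeddings.
documents = [
    {"id": "paper-1", "year": 2021, "category": "cs.LG", "vector": [0.9, 0.1, 0.0]},
    {"id": "paper-2", "year": 2021, "category": "cs.LG", "vector": [0.1, 0.9, 0.0]},
    {"id": "paper-3", "year": 2019, "category": "cs.LG", "vector": [1.0, 0.0, 0.0]},
    {"id": "paper-4", "year": 2021, "category": "math.ST", "vector": [0.7, 0.7, 0.0]},
]

query_vector = [1.0, 0.0, 0.0]  # stands in for the embedded query text
results = search(query_vector, documents, k=2, year=2021)
print(results)
```

Because the query and the documents live in the same vector space, the filter narrows the candidate set and the similarity score orders what remains.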

With the right vector database solution, these searches could be performed over hundreds of millions of documents in 100ms or less.

We recently put this into action and built redis-arXiv-search on top of the arXiv dataset (provided by Kaggle) as a live demo. Under the hood, we’re using Redis Vector Similarity Search, a Dockerized Python FastAPI, and a React Typescript single page app (SPA).

Paper abstracts were converted into embeddings and stored in RediSearch. With this app, we show how you can search over these papers with natural language.


Let’s try an example: “machine learning helps me get healthier”. When you enter this query, the text is sent to a Python server that converts the text to an embedding and performs a vector search. 

Vector search capabilities of Redis
arXiv document search example


As you can see, the top four results are all related to health outcomes and policy. If you try to confuse it with something even more complex like: “jay z and beyonce”, the top results are as follows:

  1. Elites, communities and the limited benefits of mentorship in electronic music
  2. Can Celebrities Burst Your Bubble?
  3. Forbidden triads and Creative Success in Jazz: The Miles Davis Factor
  4. Popularity and Centrality in Spotify Networks: Critical transitions in eigenvector centrality


We are pretty certain that the names of these two icons don’t show up verbatim in the paper abstracts… Because of the semantic properties encoded in the sentence embeddings, this application is able to associate “Jay Z” and “Beyonce” with topics like Music, Celebrities, and Spotify. 

Scaling embedding workflows

That was the happy path. Realistically, most production-grade document retrieval systems rely on hundreds of millions or even billions of docs. It’s the price to pay for a system that can actually solve real-world problems over unstructured data.

Beyond scaling the embedding workflows, you’ll also need to have a database with enough horsepower to build the search index in a timely fashion. 

GPU acceleration

In 2022, giving out free computers is the best way to make friends with anybody. Thankfully, our friends at Saturn Cloud have partnered with us to share access to GPU hardware.

They have a solid free tier that gives us access to an NVIDIA T4 with the ability to upgrade for a fee. Recently, Google Colab also announced a new pricing structure, a “Pay As You Go” format, which allows users to have flexibility in exhausting their compute quota over time.


These are both great options when running workloads on your CPU bound laptop or instance won’t cut it. 


What’s even better is that Hugging Face Transformers can take advantage of GPU acceleration out-of-the-box. This can speed up ad-hoc embedding workflows quite a bit. However, for production use cases with massive amounts of data, a single GPU may not cut it. 

Multi-GPU with Dask and cuDF

What if the data does not fit into the RAM of a single GPU instance, and you need the boost? There are many ways a data engineer might address this issue, but here I will focus on one particular approach leveraging Dask and cuDF.


The RAPIDS team at NVIDIA is dedicated to building open-source tools for executing data science and analytics on GPUs. All of the Python libraries have a comfortable feel to them, empowering engineers to take advantage of powerful hardware under the surface. 


Scaling out workloads on multiple GPUs w/ RAPIDS tooling involves leveraging multi-node Dask clusters and cuDF data frames. Most Pythonistas are familiar with the popular Pandas data frame library. cuDF, built on Apache Arrow, provides an interface very similar to Pandas, running on a GPU, all without having to know the ins and outs of CUDA development.


Workflow - cuDF data frame of arXiv papers
Workflow – Dask cuDF processing arXiv papers


In the above workflow, a cuDF data frame of arXiv papers was loaded and partitions were created across a 3 node Dask cluster (with each worker node as an NVIDIA T4). In parallel, a user-defined function was applied to each data frame partition that processed and embedded the text using a Sentence Transformer model.


This approach provided linear scalability with the number of nodes in the Dask cluster. With 3 worker nodes, the total runtime decreased by a factor of 3. 


Even with multi-GPU acceleration, data is mapped to and from machines. It’s heavily dependent on RAM, especially after the large embedding vectors have been created.

A few variations to consider:

  • Load and process iterative batches of documents from a source database.
  • Programmatically load partitions of data from a source database to several Dask workers for parallel execution.
  • Perform streaming updates from the Dask workers directly to the vector database rather than loading embeddings back to single GPU RAM.
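The first and third variations can be combined in a simple pattern: pull documents in fixed-size batches, embed each batch, and write the vectors out immediately so that only one batch sits in memory at a time. The sketch below uses only the standard library. `embed_batch` is a hypothetical stand-in for a real embedding model, and `sink` stands in for a write to the vector database.

```python
from itertools import islice


def iter_batches(records, batch_size):
    """Yield successive lists of at most batch_size records from any iterable."""
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def embed_batch(texts):
    """Hypothetical stand-in for a model: maps each text to a 2-d vector."""
    return [[float(len(text)), float(len(text.split()))] for text in texts]


def process(records, batch_size, sink):
    """Embed records batch by batch, passing each batch of vectors to sink."""
    total = 0
    for batch in iter_batches(records, batch_size):
        sink(embed_batch(batch))
        total += len(batch)
    return total


abstracts = [f"abstract number {i}" for i in range(7)]
stored = []  # stands in for the vector database
written = process(abstracts, batch_size=3, sink=stored.extend)
print(written, len(stored))
```

In a distributed setup, each worker could run the same loop over its own partition of the source data, which is one way to avoid gathering all embeddings back onto a single machine.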

Call to action – it’s YOUR turn!

Inspired by the initial work on the arXiv search demo, Redis is officially launching a Vector Search Engineering Lab (Hackathon) co-sponsored by MLOps Community and Saturn Cloud. Read more about it here.


Vector search


This is the future. Vector search & document retrieval is now more accessible than ever before thanks to open-source tools like Redis, RAPIDS, Hugging Face, PyTorch, Kaggle, and more! Take the opportunity to get ahead of the curve and join in on the action. We’ve made it super simple to get started and acquire (or sharpen) an emerging set of skills.

In the end, you will get to showcase what you’ve built and win $$ prizes.

The hackathon will run from October 24 – November 4 and include folks across the globe, professionals and students alike. Register your team (up to 4 people) today! You don’t want to miss it.

Ayesha Saleem
| October 4, 2022

The use of AI in culture raises interesting ethical questions, a field now commonly referred to as AI ethics.  

In 2016, a new Rembrandt-style painting, “The Next Rembrandt”, was designed by a computer and created by a 3D printer, 347 years after the painter’s death.  

This artistic feat became possible once 346 Rembrandt paintings were analyzed together. The paintings were analyzed pixel by pixel and upscaled by deep learning algorithms to create a unique database.  

AI ethics - Rembrandt painting
Ethical dilemma of AI- Rembrandt painting

Every detail of Rembrandt’s artistic identity could then be captured and set the foundation for an algorithm capable of creating an unprecedented masterpiece. To bring the painting to life, a 3D printer recreated the texture of brushstrokes and layers of paint on the canvas for a breath-taking result that could trick any art expert. 

The ethical dilemma arose when it came to crediting the author of the painting. Who could it be?  

We cannot overlook the transformations brought by intelligent machine systems in today’s world for the better. To name a few, artificial intelligence contributed to optimizing planning, detecting fraud, composing art, conducting research, and providing translations. 

Undoubtedly, it all contributed to the more efficient and consequently richer world of today. Leading global tech companies emphasize embracing the boundless landscape of artificial intelligence to stay ahead in a competitive market.  

Amidst the boom of overwhelming technological revolutions, we cannot overlook the new frontier for ethics and risk assessment.  

Regardless of the risks AI poses, there are many real-world problems that are begging to be solved by data scientists. Check out this informative session by Raja Iqbal (Founder and lead instructor at Data Science Dojo) on AI For Social Good 

Some of the key ethical issues in AI you must learn about are: 

1. Privacy & surveillance – Is your sensitive information secured?

Personally identifiable information must be accessible to authorized users only. The other key aspects of privacy to consider in artificial intelligence are information privacy, privacy as an aspect of personhood, control over information about oneself, and the right to secrecy. 

Business today is going digital. We are associated with the digital sphere. Most digital data available online connects to a single Internet. There is increasingly more sensor technology in use that generates data about non-digital aspects of our lives. AI not only contributes to data collection but also drives possibilities for data analysis.  

Privacy and surveillance - AI ethics
Fingerprint scan, Privacy and surveillance – Data Science Dojo

Much of the most privacy-sensitive data analysis today, such as search algorithms, recommendation engines, and AdTech networks, is driven by machine learning and algorithmic decisions. However, as artificial intelligence evolves, it opens up new ways to intrude on the privacy interests of users.

For instance, facial recognition introduces privacy issues with the increased use of digital photographs. Machine recognition of faces has progressed rapidly from fuzzy images to rapid recognition of individual humans.  

2. Manipulation of behavior – How does the internet know our preferences?

Internet use and online activities keep us engaged every day. We do not realize that our data is constantly being collected and our information tracked. Our personal data is used to manipulate our behavior both online and offline.  

If you are wondering exactly when businesses use the information they gather and how they manipulate us, marketers and advertisers are the best examples. To sell the right product to the right customer, it is important to know your customers’ behavior: their interests, past purchase history, location, and other key demographics. Therefore, advertisers retrieve the personal information of potential customers that is available online. 

AI Ethics - User behaviour
Behavior manipulation- AI ethics, Data Science Dojo

Social media has become the hub of manipulating user behaviors by marketers to maximize profits. AI with its advanced social media algorithms identifies vulnerabilities in human behavior and influences our decision-making process. 

Artificial intelligence integrates such algorithms with digital media to exploit the human biases they detect. This enables personalized, addictive strategies for the consumption of (online) goods, and takes advantage of individuals’ vulnerable states to promote products and services that match their temporary emotions. 

3. Opacity of AI systems – Complex AI processes

Danaher stated, “we are creating decision-making processes that constrain and limit opportunities for human participation” 

Artificial Intelligence supports automated decision-making, which can sideline the people affected and leave them little say in the outcome. AI processes often work in a way that no one knows exactly how the output is generated. Therefore, the decision remains opaque even to experts.  

AI systems use machine learning techniques in neural networks to extract patterns from a given dataset, with or without “correct” solutions provided, i.e., supervised, semi-supervised, or unsupervised learning.


Read this blog to learn more about AI powered document search


Machine learning captures existing patterns in the data with the help of these techniques, and then labels these patterns in a way that is useful for the decision the system makes, while the programmer does not really know which patterns in the data the system has used. 

4. Human-robot interaction – Are robots more capable than us?

Just as AI is now widely used to manipulate human behavior, it is also actively driving robots. This can become problematic if their processes or appearance involve deception or threaten human dignity. 

The key ethical issue here is, “Should robots be programmed to deceive us?” If we answer this question with a yes, then the next question to ask is “what should be the limits of deception?” If we say that robots can deceive us if it does not seriously harm us, then the robot might lie about its abilities or pretend to have more knowledge than it has.  

human robot - AI ethics
Human robot interaction- Data Science Dojo

If we believe that robots should not be programmed to deceive humans, then the next ethical question becomes “should robots be programmed to lie at all?” The answer would depend on what kind of information they are giving and whether humans are able to provide an alternative source.  

Robots are now being deployed in the workplace to do jobs that are dangerous, difficult, or dirty. The automation of jobs is inevitable in the future, and it can be seen as a benefit to society or a problem that needs to be solved. The problem arises when we start talking about human robot interaction and how robots should behave around humans in the workplace. 

5. Autonomous systems – AI gaining self-sufficiency

An autonomous system can be defined as a self-governing or self-acting entity that operates without external control. It can also be defined as a system that can make its own decisions based on its programming and environment. 

The next step in understanding the ethical implications of AI is to analyze how it affects society, humans, and our economy. This will allow us to predict the future of AI and what kind of impact it will have on society if left unchecked. 

Societies where AI is rapidly replacing humans can be harmed or suffer in the long run. For instance, AI writers are sometimes seen as a replacement for human copywriters, when they are designed only to bring efficiency to a writer’s job, provide assistance, and help get rid of writer’s block while generating content ideas at scale.  

Secondly, autonomous vehicles are among the most relevant examples in the heated debate over ethical issues in AI. It is not yet clear what the future of autonomous vehicles will be. The main ethical concern around autonomous cars is that they could cause accidents and fatalities. 

Some people believe that because these cars are programmed to be safe, they should be given priority on the road. Others think that these vehicles should have the same rules as human drivers. 

Enroll in Data Science Bootcamp today to learn advanced technological revolutions 

6. Machine ethics – Can we infuse good behavior in machines?

Before we get into the ethical issues associated with machines, we need to know that machine ethics is not about humans using machines; it is solely concerned with machines operating independently as subjects. 

The topic of machine ethics is a broad and complex one that includes a few areas of inquiry. It touches on the nature of what it means for something to be intelligent, the capacity for artificial intelligence to perform tasks that would otherwise require human intelligence, the moral status of artificially intelligent agents, and more. 


Read this blog to learn about Big Data Ethics


The field is still in its infancy, but it has already shown promise in helping us understand how we should deal with certain moral dilemmas. 

In the past few years, there has been a lot of research on how to make AI more ethical. But how can we define ethics for machines? 

Machines can be programmed with rules for good behavior and, based on those principles, to avoid making bad decisions. It is not difficult to imagine that in the future, we will be able to tell if an AI has ethical values by observing its behavior and its decision-making process. 

Isaac Asimov’s Three Laws of Robotics, often cited in machine ethics, are: 

First Law—A robot may not injure a human being or, through inaction, allow a human being to come to harm.  

Second Law—A robot must obey the orders given it by human beings except where such orders would conflict with the First Law.  

Third Law—A robot must protect its own existence if such protection does not conflict with the First or Second Laws. 

Artificial Moral Agents 

The development of artificial moral agents (AMA) is a hot topic in the AI space. The AMA has been designed to be a moral agent that can make moral decisions and act according to these decisions. As such, it has the potential to have significant impacts on human lives. 

The development of AMA is not without ethical issues. The first issue is that AMAs (Artificial Moral Agents) will have to be programmed with some form of morality system which could be based on human values or principles from other sources.  

This means that there are many possibilities for diverse types of AMAs and several types of morality systems, which may lead to disagreements about what an AMA should do in each situation. Secondly, we need to consider how and when these AMAs should be used, as they could cause significant harm if they are not used properly. 

Closing on AI ethics 

Over the years, we went from, “AI is impossible” (Dreyfus 1972) and “AI is just automation” (Lighthill 1973) to “AI will solve all problems” (Kurzweil 1999) and “AI may kill us all” (Bostrom 2014). 

Several questions arise with the increasing dependency on AI and robotics. Before we rely on these systems further, we must have clarity about what the systems themselves should do, and what risks they have in the long term.  

Let us know in the comments if you think AI also challenges the human view of humanity as the intelligent and dominant species on Earth.  

Top 15 AI startups developing financial services in the USA
Alyshai Nadeem
| September 1, 2022

From managing your cash flow to making lending decisions for you, here is a list of 15 fintech startups using Artificial Intelligence to enhance your experience.

1. Affirm is changing the way people buy stuff 


Affirm is a consumer application that grants loans for purchases at various retailers. The startup makes use of multiple machine learning algorithms for credit underwriting and happens to be the exclusive buy now, pay later partner for Amazon. 

Max Levchin, the co-founder of PayPal, along with Nathan Gettings, Jeffrey Kaditz, and Alex Rampell introduced Affirm in 2012 to the world. 

Affirm also partnered with Walmart in 2019, allowing customers to access the app in-store and on Walmart’s website. 

Founded: 2012 

Headquarters: San Francisco 

Website: Affirm Official Site

2. HighRadius is automating financial processes


HighRadius is a fintech Software-as-a-Service (SaaS) company. The startup makes use of AI-based autonomous systems to help automate Accounts Receivable and Treasury processes. 

HighRadius provides high operational efficiency, accurate cash flow forecasting, and much more to help companies achieve strong ROI. 

Founded: 2006 

Headquarters: Houston, Texas 

Website: HighRadius Official

3. SparkCognition is building smarter, safer, and sustainable solutions for the future


SparkCognition focuses on creating AI-powered cyber-physical software for the safety, security, and reliability of IT, OT, and the IoT. The startup builds artificial intelligence solutions for applications in energy, oil and gas, manufacturing, finance, aerospace, defense, and security. 

The startup’s work in the financial sector enables businesses to improve analytical accuracy, minimize risks, accelerate reaction time to fluctuating market conditions, and sustain a competitive advantage. 

Previously, SparkCognition enabled a fintech startup to use a machine learning model to detect fraud with 90% accuracy, saving the company over $450K each year. 

Founded: 2013 

Headquarters: Austin, Texas 

Website: SparkCognition Official

4. ZestFinance helps cut losses and increase revenue


Another popular name in the financial AI industry, ZestFinance helps companies increase approval rates, cut credit losses, and improve underwriting using machine learning. 

Moreover, the startup helps lenders predict credit risk so they can increase revenues, reduce risk & ensure compliance. 

The main aim of the startup is to grant fair and transparent credit access to everyone and build an equitable financial system. 

Founded: 2009 

Headquarters: Burbank, California. 

Website: ZestFinance

5. Upstart investigates the financial background and gives you a lower rate of lending


Based on a very cool concept, Upstart first checks your education and job history to better understand your future potential, and then uses that to get you a lower lending rate. 

According to the startup itself, they look beyond a person’s credit score for personal loans, car loan refinance, and small business loans.  

Founded: 2012 

Headquarters: San Mateo, California 

Website: Upstart 

6. Vise AI is the financial advisor of the future.


An AI-driven asset management platform, Vise AI is built and designed specifically as a financial advisory platform.  

The startup builds hyper-personalized portfolios and automates portfolio management. Moreover, it aims to support financial advisors across businesses so they can focus on developing their clients and growing their businesses. 

Founded: 2019 

Headquarters: New York 

Website: ViseAI 

7. Cape Analytics helps insurers provide accurate insurance quotes


Cape Analytics combines machine learning and geospatial imagery to help identify property attributes that allow insurance companies to provide clients with accurate quotes. 

The main aim of the startup is to provide property details to combat any risks associated with climate, insurance, and real estate. 

Founded: 2014 

Headquarters: Mountain View, California, United States 

Website: Cape Analytics 

8. Clinc is revolutionizing conversational AI, one bank at a time.


Clinc develops intelligent personal financial assistants. The platform provides personalized, instant answers to any common or complex question. 

Clinc focuses on bringing conversational AI to some of the biggest banks in the world. The startup uses NLP that understands how people talk, powering exceptional customer experiences that build loyalty and generate ROI. 

Founded: 2015 

Headquarters: Ann Arbor, Michigan 

Website: Clinc

To learn more about Conversational AI, click here.

9. Sentieo is centralizing financial research tools into a single platform


Sentieo is an AI-powered financial research startup that develops and distributes a range of systems across the financial world.  

Sentieo is a financial intelligence platform that aims to centralize multiple financial research tools into a single innovative AI-powered platform. Sentieo helps analysts not only save time but also discover alpha-driving insights. 

Founded: 2012 

Headquarters: San Francisco, CA 

Website: Sentieo

10. CognitiveScale is industrializing scalable Enterprise AI


Pioneering the concept of ‘AI engineering,’ CognitiveScale aims to industrialize scalable Enterprise AI development and deployment. 

The startup makes use of its award-winning Cortex AI Platform to empower businesses. The startup helps implement trusted decision intelligence into business processes and applications for better customer experience and operational efficiency. 

Founded: 2013 

Headquarters: Austin, Texas 

Website: CognitiveScale

11. Kyndi is building the world’s first Explainable AI platform


AI company Kyndi is trying to build the world’s first Explainable AI platform for governments and commercial institutions.  

The startup hopes to transform business processes by offering auditable AI solutions across various platforms. It is built on the simple premise that higher-performing teams can produce trusted results and better business outcomes. 

Founded: 2014 

Headquarters: San Mateo, CA 

Website: Kyndi

12. Numerai is bridging the gap between the stock market and Data Scientists


Another startup transforming the financial sector is Numerai. The startup aims to transform and regularize financial data into machine learning problems for a global network of Data Scientists. 

The startup sees the stock market as inefficient with respect to developments in machine learning and artificial intelligence, because only a fraction of the world’s Data Scientists have access to its data. Numerai creates solutions to combat that. 

Founded: 2015 

Headquarters: California Street, San Francisco 

Website: Numerai 

13. Merlon Intelligence is one of the startups that provide financial security through AI


Fintech startup Merlon Intelligence helps banks mitigate potential risks and control money laundering across multiple platforms. 

The startup makes use of AI to automate adverse media screening. This helps business and financial analysts focus on quicker, more accurate, real-time decisions. 

Founded: 2016 

Headquarters: San Francisco, California 

Website: Merlon Intelligence 

14. Trade Ideas’ virtual research analyst helps with smarter trading


Trade Ideas built a virtual research analyst that can sift through multiple aspects of business and finance, including technical, fundamental, social, and much more. The virtual assistant sifts through thousands of trades every day to help find those with the highest probability of success. 

The startup makes use of thousands of data centers and makes them play with different trading scenarios every single day. 

Founded: 2002 

Headquarters: San Diego County, California 

Website: Trade Ideas

15. Datrics is democratizing self-service financial analytics using data science


Fintech startup Datrics helps democratize self-service analytics as well as machine learning solutions by providing an easy-to-use drag-and-drop interface. 

Datrics provides a no-code platform for building analytics and data science workflows. The startup promotes data-driven decision-making that allows enterprises to make better use of their financial service analytics. 


Headquarters: Delaware, United States 

Website: Datrics

Also, read about 10 AI startups revolutionizing healthcare you should know about

If you would like to learn more about Artificial Intelligence, click here.

Is there any other AI-based fintech startup that you would like us to talk about? Let us know in the comments below. For similar listicles, click here.

10 AI startups revolutionizing healthcare you should know about
Alyshai Nadeem
| August 30, 2022

Healthcare is a necessity for human life, yet many do not have access to it. Here are 10 startups that are using AI to change healthcare.

Healthcare is a necessity that is inaccessible to many across the world. Despite rapid developments and improvements in medical research, healthcare systems have become increasingly unaffordable.

However, multiple startups and tech companies have been trying their best to integrate AI and machine learning for improvements in this sector.

As the population of the planet increases along with life expectancy due to advancements in agriculture, science, medicine, and more, the demand for functioning healthcare systems also rises.

According to McKinsey & Co., by the year 2050, in Europe and North America, 1 in 4 people will be over the age of 65 (Source). Healthcare systems by that time will have to manage numerous patients with complex needs.

Read about Top 15 AI startups developing financial services in the USA

Here is a list of a few Artificial Intelligence (AI) startups that are trying their best to revolutionize the healthcare industry as we know it today and help their fellow human beings:

1. Owkin aims to find the right drug for every patient.


Originating in Paris, France, Owkin was launched in 2016 and develops a federated learning AI platform that helps pharmaceutical companies discover new drugs, enhance the drug development process, and identify the best drug for the ‘right patient.’ Pretty cool, right?

Owkin makes use of different machine learning models to test AI models on distributed data.

The startup also aims to empower researchers across hospitals, educational institutes, and pharmaceutical companies to understand why drug efficacy varies from patient to patient.

Read more about this startup, here.

2. Overjet is providing accurate data for better patient care and disease management.


Founded by PhDs from the Massachusetts Institute of Technology and dentists from Harvard School of Dental Medicine in 2018, Overjet is changing the playground in dental AI.

Overjet uses AI to encode a dentist-level understanding of how to identify diseases and their progression into software.

Overjet aims to provide effective and accurate data to dentists, dental groups, and insurance companies so that they can provide the best patient care and disease management.

You can learn more about the startup, here.

3. From the mid-Atlantic health system to an enterprise-wide AI workforce, Olive AI is improving operational healthcare efficiency.


Founded in 2012, Olive AI is the only known AI as a Service (AIaaS) built for the healthcare sector. The premier AI startup utilizes the power of cloud computing by implementing Amazon Web Services (AWS) and automating systems that accelerate time to care.

With more than 200 enterprise customers, including health systems, insurance companies, and a growing number of healthcare companies, Olive AI assists healthcare workers with time-consuming tasks like prior authorizations and patient verifications.

To find out more about Olive AI, click here.

Want to learn more about AI as a Service? Click here.

4. Insitro provides better medicines for patients with the overlap of biology and machine learning.


The perfect cross between biology and machine learning, Insitro aims to support pharmaceutical research and development, and improve healthcare services. Founded in 2018, Insitro promotes Machine Learning-Based Drug Discovery for which it has raised a substantial amount of funding over the years.

According to a recent Forbes ranking of the top 50 AI businesses, the HealthTech startup is ranked at 35 for having the most promising AI-based medication development process.

Further information on Insitro can be found here.

5. Caption Health makes early disease detection easier.



Founded in 2013, Caption Health has since been a top provider of medical artificial intelligence. The startup focuses on the early identification of illnesses.

Caption Health was the first to provide FDA-approved AI imaging and guidance software for cardiac ultrasonography. The startup has helped remove numerous barriers to treatment and enabled a wide range of people to perform heart scans of diagnostic quality.

Caption Health can be reached here.

6. InformAI is trying to transform the way healthcare is delivered and improve patient outcomes.


Founded in 2017, InformAI expedites medical diagnosis while increasing the productivity of medical professionals.

Focusing on AI and deep learning, as well as business analytics solutions for hospitals and medical companies, InformAI was built for AI-enabled medical image classification, healthcare operations, patient outcome predictors, and much more.

InformAI not only has top-tier medical professionals at its disposal, but also has 10 times more access to proprietary medical datasets, as well as numerous AI toolsets for data augmentation, model optimization, and 3D neural networks.

The startup’s incredible work can be further explored here.

7. Recursion is decoding biology to improve lives across the globe.


A biotechnology startup, Recursion was founded in 2013 and focuses on multiple disciplines, ranging from biology, chemistry, automation, and data science, to even engineering.

Recursion focuses on creating one of the largest and fastest-growing proprietary biological and chemical datasets in the world.

To learn more about the startup, click here

8. Remedy Health provides information and insights for better navigation of the healthcare industry.


As AI advances, so does the technology that powers it. Another marvelous startup known as Remedy Health is allowing people to conduct phone screening interviews with clinically skilled professionals to help identify hidden chronic conditions.

The startup makes use of virtual consultations, allowing low-cost, non-physician employees to proactively screen patients.

To learn more about Remedy Health, click here.

9. Sensely is transforming conversational AI.


Founded in 2013, Sensely is an avatar and chatbot-based platform that aids insurance plan members and patients.

The startup provides virtual assistance solutions to different enterprises including insurance and pharmaceutical companies, as well as hospitals to help them converse better with their members.

Sensely’s business ideology can further be explored here.

10. Oncora Medical provides a one-stop solution for oncologists.


Another digital health company, founded in 2014, Oncora Medical focuses on creating a crossover between data and machine learning for radiation oncology.

The main aim of the startup was to create a centralized platform for better collection and application of real-world data that can in some way help patients.

Other details on Oncora Medical can be found here.


With the global AI-in-healthcare market expected to reach over USD 36B by the year 2025, it is reasonable to expect that this market and specific niche will continue to grow even further.

If you would like to learn more about Artificial Intelligence, click here.

Was there any AI-based healthcare startup that we missed? Let us know in the comments below. For similar listicles, click here.

This blog discusses the applications of AI in healthcare. We will learn about some businesses and startups that are using AI to revolutionize the healthcare industry. These advancements in AI have helped in the fight against COVID-19.


COVID-19 was first recognized on December 30, 2019, by BlueDot. It did so nine days before the World Health Organization released its alert for coronavirus. How did BlueDot do it? BlueDot used the power of AI and data science to predict and track infectious diseases. It identified an emerging risk of unusual pneumonia happening around a market in Wuhan.

The role of data science and AI in the healthcare industry is not limited to that. It has now become possible to learn the likely causes of whatever symptoms you are experiencing, such as cough, fever, and body pain, without visiting a doctor, and to treat them yourself at home. Platforms like Ada Health and Sensely can assess the symptoms you report.

The healthcare industry generates 30% of the 1.145 trillion MB of data produced every day. This enormous amount of data is the driving force for revolutionizing the industry and bringing convenience to people’s lives.

Applications of Data Science in Healthcare:

1. Prediction and spread of diseases

Predictive analytics process

Predictive analysis, using historical data to find patterns and predict future outcomes, can find the correlation between symptoms, patients’ habits, and diseases to derive meaningful predictions from the data. Here are some examples of how predictive analytics plays a role in improving the quality of life and medical condition of the patients:

  • Magic Box, built by the UNICEF Office of Innovation, uses real-time data from public sources and private sector partners to generate actionable insights. It provides health workers with disease spread predictions and countermeasures. During the early stage of COVID-19, Magic Box used airline data to correctly predict which African states were most likely to see imported cases. This prediction proved beneficial in planning and strategizing quarantine, travel restrictions, and social distancing enforcement.
  • Another use of analytics in healthcare is AIME. It is an AI platform that helps health professionals tackle mosquito-borne diseases like dengue. AIME uses data such as health center notifications of dengue, population density, and water accumulation spots to predict outbreaks in advance with an accuracy of 80%. It aids health professionals in Malaysia, Brazil, and the Philippines. The Penang district of Malaysia saw a cost reduction of USD 500,000 by using AIME.
  • BlueDot is an intelligent platform that warns about the spread of infectious diseases. In 2014, it identified the Ebola outbreak risk in West Africa accurately. It also predicted the spread of the Zika virus in Florida six months before the official reports.
  • Sensely uses data from trusted sources like Mayo Clinic and NHS to diagnose the disease. The patient enters symptoms through a chatbot used for diagnosis. Sensely launched a series of customized COVID-19 screening and education tools with enterprises around the world which played their role in supplying trusted advice urgently.

Want to learn more about predictive analytics? Join our Data Science Bootcamp today.

2. Optimizing clinic performance

According to a survey carried out in January 2020, 85 percent of the respondents working in smart hospitals reported being satisfied with their work compared to 80 percent of the respondents from digital hospitals. Similarly, 74 percent of the respondents from smart hospitals would recommend the medical profession to others, while only 66 percent of the respondents from digital hospitals recommend it.

Staff retention has always been a challenge, but it has become an enormous one post-pandemic. For instance, six months into the COVID-19 outbreak, almost a quarter of care staff in Flanders, Belgium, had quit their jobs. The care staff felt exhausted, experienced sleep deprivation, and could not relax properly. A smart healthcare system can help address these issues.

Smart healthcare systems can help optimize operations and provide prompt service to patients. They forecast the patient load at a particular time and plan resources to improve patient care. They can also optimize clinic staff scheduling and supply, which reduces waiting time and improves the overall experience.

Getting data from partners and other third-party sources can be beneficial too. Data from various sources can help in process management, real-time monitoring, and operational efficiency. It leads to overall clinic performance optimization. We can perform deep analytics of this data to make predictions for the next 24 hours, which helps the staff focus on delivering care.

3. Data science for medical imaging

According to the World Health Organization (WHO), radiology services are not accessible to two-thirds of the world population. Patients must wait for weeks and travel long distances for simple ultrasound scans. One of the foremost uses of data science in the healthcare industry is medical imaging. Data science is now used to inspect images from X-rays, MRIs, and CT scans to find irregularities. Traditionally, radiologists did this task manually, but it was difficult for them to find microscopic deformities. The patient’s treatment depends highly on insights gained from these images.

Data science can help radiologists with image segmentation to identify different anatomical regions. Applying image processing techniques like noise reduction and removal, edge detection, image recognition, image enhancement, and reconstruction can also help with inspecting images and gaining insights.

One example of a platform that uses data science for medical imaging is Medo. It provides a fully automated platform that enables quick and accurate imaging evaluations. Medo transforms scans taken from different angles into a 3D model. They compare this 3D model against a database of millions of other scans using machine learning to produce a recommended diagnosis in real-time. Platforms like Medo make radiology services more accessible to the population worldwide.

4. Drug discovery with data science

Drug discovery is a complex task. Traditionally, it took decades to discover a new drug, but data science has reduced that time to less than a year. Pharmaceutical companies rely heavily on data science to develop better drugs. Researchers need to identify the causative agent and understand its characteristics, which may require millions of test cases. This is a huge problem for pharmaceutical companies because performing these tests can take decades. With data science, the same testing can be done in a month or even a few weeks.

For example, the causative agent for COVID-19 is the SARS-CoV-2 virus. For discovering an effective drug for COVID-19, deep learning is used to identify and design a molecule that binds to SARS-CoV-2 to inhibit its function by using extracted data from scientific literature through NLP (Natural Language Processing).

5. Monitoring patients’ health

The human body generates two terabytes of data daily. We are trying to collect most of this data using smart home devices and wearables. The data these devices collect includes heart rate, blood sugar, and even brain activity. This data can revolutionize the healthcare industry if we know how to use it.

Every 36 seconds, a person dies from cardiovascular disease in the United States. Data science can identify common conditions and predict disorders by detecting the slightest change in health indicators. Timely alerts about such changes can save thousands of lives. Personal health coaches are designed to provide deep insights into a patient’s health and to raise an alert if a health indicator reaches a dangerous level.

Companies like Corti can detect cardiac arrest in 48 seconds through phone calls. This solution uses real-time natural language processing to listen to emergency calls and look out for several verbal and non-verbal patterns of communication. It is trained on a dataset of emergency calls and acts as a personal assistant of the call responder. It helps the responder ask relevant questions, provide insights, and predict if the caller is suffering from cardiac arrest. Corti finds cardiac arrest more accurately and faster than humans.

6. Virtual assistants in healthcare

The WHO estimated that by 2030 the world will need an extra 18 million health workers. Virtual assistant platforms can help meet this need. According to a survey by Nuance, 92% of clinicians believe virtual assistant capabilities would reduce the burden on the care team and improve the patient experience.

Patients can enter their symptoms as input to the platform and ask questions. The platform then tells them about their likely medical condition using data on symptoms and causes. This is possible because of predictive modeling of disease. These platforms can also assist patients in many other ways, like reminding them to take medication on time.

An example of such a platform is Ada Health, an AI-enabled symptom checker. A person enters symptoms through a chatbot, and Ada uses all available data from patients, past medical history, EHR implementation, and other sources to predict a potential health issue. Over 11 million people use this platform.

Other examples of health chatbots are Babylon Health, Sensely, and Florence.


In this blog, we discussed the applications of AI in healthcare. We learned about some businesses and startups that are using AI to revolutionize the healthcare industry. These advancements in AI have helped in the fight against COVID-19. To learn more about data science, enroll in our Data Science Bootcamp, a remote instructor-led bootcamp where you will learn data science through a series of lectures and hands-on exercises. Next, we will be creating a prognosis prediction system in Python. You can follow along with my next blog post here.

Follow Along

Want to create data science applications with Python? Check out our Python for Data Science training. 

Recommender systems are one of the most popular algorithms in data science today. Learn how to build a simple movie recommender system.

Recommender systems possess immense capability in various sectors ranging from entertainment to e-commerce. Recommender Systems have proven to be instrumental in pushing up company revenues and customer satisfaction with their implementation. Therefore, it is essential for machine learning enthusiasts to get a grasp on it and get familiar with related concepts.

As the amount of available information increases, new problems arise because people find it hard to select the items they actually want to see or use. This is where recommender systems come in. They help us make decisions by learning our preferences or by learning the preferences of similar users.

They are used by almost every major company in some form or the other. Netflix uses it to suggest movies to customers, YouTube uses it to decide which video to play next on autoplay, and Facebook uses it to recommend pages to like and people to follow.

This way recommender systems have helped organizations retain customers by providing tailored suggestions specific to the customer’s needs. According to a study by McKinsey, 35 percent of what consumers purchase on Amazon and 75 percent of what they watch on Netflix come from product recommendations based on such algorithms.

Netflix - Product recommender systems
Audience watch Netflix and Youtube on recommendations – Recommender systems

Recommender systems can be classified under 2 major categories: Collaborative Systems and Content-Based Systems.

Collaborative systems

Collaborative systems provide suggestions based on what other similar users liked in the past. By recording the preferences of users, a collaborative system clusters similar users and provides recommendations based on the activity of users within the same group.
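
To make the idea concrete, here is a minimal base R sketch of a user-based approach on a made-up rating matrix. The users, items, and ratings are invented for illustration, and cosine similarity with a similarity-weighted average is just one common choice, not necessarily what any particular library does internally.

```r
# Toy ratings: rows are users, columns are items, NA = not rated.
# All names and values here are made up for illustration.
ratings <- matrix(
  c(5, 4, NA,
    5, 4, 2,
    1, 5, 5),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("u1", "u2", "u3"), c("A", "B", "C"))
)

# Cosine similarity computed over the items both users have rated.
cosine_sim <- function(x, y) {
  both <- !is.na(x) & !is.na(y)
  sum(x[both] * y[both]) / (sqrt(sum(x[both]^2)) * sqrt(sum(y[both]^2)))
}

# Similarity of u1 to every other user.
others <- c("u2", "u3")
sims <- sapply(others, function(u) cosine_sim(ratings["u1", ], ratings[u, ]))

# Predicted rating of item C for u1: similarity-weighted average
# of the ratings that the other users gave to C.
pred_C <- sum(sims * ratings[others, "C"]) / sum(sims)

print(round(sims, 3))
print(round(pred_C, 2))
```

Because u2 rated items A and B exactly like u1, u2's rating of C carries the most weight in the prediction.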

Content-based systems

Content-based systems provide recommendations based on what the user liked in the past. This can be in the form of movie ratings, likes, and clicks. All the recorded activity allows these algorithms to suggest products that have features similar to the products the user liked in the past.
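
As a rough illustration, the sketch below scores unseen movies by how closely their features match a profile built from the movies a user liked. The movie IDs, genres, and the averaging of liked features into a profile are all assumptions made for this toy example.

```r
# Toy item features: rows are movies, columns are genres (1 = has genre).
# All titles and genres are made up for illustration.
features <- matrix(
  c(1, 0, 0,
    1, 1, 0,
    0, 0, 1,
    1, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("m1", "m2", "m3", "m4"), c("action", "comedy", "drama"))
)

# Movies the user already liked.
liked <- c("m1", "m2")

# User profile: average feature vector of the liked movies.
profile <- colMeans(features[liked, , drop = FALSE])

# Cosine similarity between each movie and the profile.
cosine <- function(a, b) sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
scores <- apply(features, 1, cosine, b = profile)

# Rank only the movies the user has not seen yet.
unseen <- setdiff(rownames(features), liked)
ranking <- sort(scores[unseen], decreasing = TRUE)
print(round(ranking, 3))
```

Here m4 ranks above m3 because it shares the action genre with the liked movies, while m3 shares none of their features.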

Hands-on practice with recommender systems in R will boost your data science skills to a great extent. We’ll first practice using the MovieLens 100K Dataset, which contains 100,000 movie ratings from around 1000 users on 1700 movies. This exercise will allow you to recommend movies to a particular user based on the movies the user has already rated. We’ll be using the recommenderlab package, which contains a number of popular recommendation algorithms.

After completing the first exercise, you’ll use recommenderlab to recommend music to customers. We use the last.fm dataset, which has 92,800 artist listening records from 1892 users. We are going to recommend artists that a user is highly likely to listen to.

Install and import required libraries
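
A minimal setup sketch follows. It assumes the tutorial only needs recommenderlab (named above) and reshape2, which is where the acast function used later comes from; adjust the list if your setup differs.

```r
# Install once if the packages are missing (assumes access to CRAN).
for (pkg in c("recommenderlab", "reshape2")) {
  if (!requireNamespace(pkg, quietly = TRUE)) install.packages(pkg)
}

library(recommenderlab)  # recommender algorithms and the MovieLense data
library(reshape2)        # provides acast(), used later for the Last.fm data
```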


Import data

The recommenderlab package frees us from the hassle of importing the MovieLens 100K dataset. It provides a simple way to fetch the MovieLens dataset in a format that is compatible with the recommender model. MovieLense is an object of class “realRatingMatrix”, which is a special type of matrix containing ratings. The data is stored as a sparse matrix with the movie names in the columns and the user IDs in the rows. The intersection of a user ID and a particular movie gives the rating given by that user on a scale of 1-5.

When you load and print the data, you will see that the MovieLense matrix consists of 943 users (rows) and 1664 movies (columns), with 99392 ratings overall.
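
One way to load and inspect the data is sketched below, assuming the standard data() mechanism for datasets bundled with a package.

```r
library(recommenderlab)

# Load the MovieLense ratings that ship with recommenderlab.
data(MovieLense)

# Printing the object reports its dimensions and number of ratings.
MovieLense
class(MovieLense)
dim(MovieLense)
```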

Rating matrix

Data summary

By running the code below, we will visualize a small part of the dataset for our understanding. The code will only display the first 10 rows and 10 columns of our dataset. You can notice that the scores given by the users are integers ranging from 1-5. You’ll also note that most of the values are missing (marked as ‘NA’) indicating that the user hasn’t watched or rated that movie.

ml10 <- MovieLense[c(1:10),]
ml10 <- ml10[,c(1:10)]
as(ml10, "matrix")

Next, we’ll visualize the first 100 rows and 100 columns of the MovieLens data matrix as a heatmap, which shows the rating for each combination of user (row) and movie (column).
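
A short sketch of the heatmap step, using the image method that recommenderlab provides for rating matrices. The plot title is our own addition.

```r
library(recommenderlab)
data(MovieLense)

# Heatmap of the first 100 users (rows) and 100 movies (columns).
image(MovieLense[1:100, 1:100], main = "Raw ratings: first 100 users and movies")
```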

Visualize movie ratings in the form of heatmap


We will now train our model using recommenderlab’s Recommender function below. The function learns a recommender model from the given data, which in this case is the MovieLens data. In the parameters, we specify one of the several algorithms offered by recommenderlab for learning. Here we’ll choose UBCF, user-based collaborative filtering. Collaborative filtering uses rating data given by many users for many items as the basis for predicting missing ratings and/or for creating a top-N recommendation list for a given user, called the active user.

train <- MovieLense
# Learn a user-based collaborative filtering model and store it in our_model
our_model <- Recommender(train, method = "UBCF")
our_model

Collaborative filtering


We will now move ahead and create predictions. From our interaction matrix which is in our dataset MovieLens, we will predict the score for the movies the user hasn’t rated using our recommender model and list the top scoring movies that our model scored. We will use recommenderlab’s predict function that creates recommendations using a recommender model, our_model in this case, and data about new users.

We will be predicting for a specified user. Below, we have specified a user with ID 115. We have also set n = 10 as our parameter to limit the response to the 10 movies our model scores highest. These are the movies our model will recommend to the specified user based on their previous ratings.

User = 115
pre <- predict(our_model, MovieLense[User], n = 10)


List already liked

In the code below we will list the movies the user has already rated and display the score he gave.

user_ratings <- train[User]
as(user_ratings, "list")

View result

In the code below, we will display the predictions created in our pre variable. We will display it in form of a list.
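
The display step can be sketched as follows. To keep the snippet runnable on its own, it repeats the model and prediction steps from above; pre_list is a name we introduce here for illustration.

```r
library(recommenderlab)
data(MovieLense)

# Rebuild the model and the prediction from the earlier steps.
our_model <- Recommender(MovieLense, method = "UBCF")
User <- 115
pre <- predict(our_model, MovieLense[User], n = 10)

# Convert the top-N recommendation object into a plain list of movie titles.
pre_list <- as(pre, "list")
pre_list
```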


predictions of pre variable


Using the recommenderlab library we just created a movie recommender system based on the collaborative filtering algorithm. We have successfully recommended 10 movies that the user is likely to prefer. The recommenderlab library could be used to create recommendations using other datasets apart from the MovieLens dataset. The purpose of the exercise above was to provide you a glimpse of how these models function.

Practice with lastFM dataset

For more practice with recommender systems, we will now recommend artists to our users. We will use the LastFM dataset. This dataset contains social networking, tagging, and music artist listening information from a set of 2K users from Last.fm online music system. It contains almost 92,800 artist listening records from 1892 users.

We will again use the recommenderlab library to create our recommendation model. Since this dataset cannot be fetched using any recommenderlab function as we did for the MovieLens dataset, we will manually fetch the dataset and practice converting it to the realRatingMatrix which is the format that our model will input for modeling.

Below we’ll import 2 files, user_artists.dat and artists.dat, into the user_artist_data and artist_data variables respectively. The user_artists.dat file is a tab-separated file that contains the artists listened to by each user. It also provides a listening count for each [user, artist] pair, stored in the attribute weight. The artists.dat file contains information about the music artists listened to and tagged by the users. It is a tab-separated file that contains the artist id, its name, URL, and picture URL. It is available on this link to the zip file.

Let’s import our dataset below:

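# PATH is assumed to hold the folder containing the unzipped files.
# R's "+" does not join strings, so build each path with file.path().
user_artist_data <- read.csv(file = file.path(PATH, "user_artists.dat"), header = TRUE, sep = "\t")
artist_data <- read.csv(file = file.path(PATH, "artists.dat"), header = TRUE, sep = "\t")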

Following the steps as we did with our Movie Recommender system, we’ll view the first few rows of our dataset by using the head method.

We’ll use the head method to view the first 10 rows of the artist dataset below. Think about which columns will be useful for our purpose, since we’ll be using the collaborative filtering method to design our model.


In the code below, we will use the acast function (from the reshape2 package) to convert our user_artist dataset into an interaction matrix. This will later be converted to a matrix and then to a realRatingMatrix. The realRatingMatrix is the format taken by recommenderlab’s Recommender function. It is a matrix containing ratings, typically 1-5 stars; here the listening counts play the role of ratings. We will store it in our rrm_data variable. After running the code, you’ll notice that the output provides the dimensions and class of our variable rrm_data.

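# Pivot to one row per user and one column per artist, filled with the listening count
m_data <- acast(user_artist_data, userID ~ artistID, value.var = "weight")
m_data <- as.matrix(m_data)
# Convert to the rating-matrix format that Recommender() expects
rrm_data <- as(m_data, "realRatingMatrix")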
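To make the reshaping step concrete, here is a minimal sketch of the same idea in plain Python rather than R. It is not what acast does internally; it only illustrates pivoting long-format (userID, artistID, weight) rows into a user-by-artist matrix. The sample records and the helper name to_interaction_matrix are hypothetical and are not taken from the LastFM files.

```python
# Sketch: pivot long-format (userID, artistID, weight) rows into a
# user-by-artist matrix. Pairs with no record stay None (unobserved, not zero).
records = [  # hypothetical listening counts, for illustration only
    (2, 51, 13883),
    (2, 52, 11690),
    (3, 51, 228),
    (3, 101, 1021),
]


def to_interaction_matrix(rows):
    """Return (users, artists, matrix) with one row per user, one column per artist."""
    users = sorted({u for u, _, _ in rows})
    artists = sorted({a for _, a, _ in rows})
    row_of = {u: i for i, u in enumerate(users)}
    col_of = {a: j for j, a in enumerate(artists)}
    matrix = [[None] * len(artists) for _ in users]
    for u, a, w in rows:
        matrix[row_of[u]][col_of[a]] = w
    return users, artists, matrix


users, artists, matrix = to_interaction_matrix(records)
for user, row in zip(users, matrix):
    print(user, row)
```

Keeping missing pairs as None rather than 0 mirrors the idea behind a rating matrix: an absent entry means "no observation", not "a rating of zero".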


Let’s visualize the first 100 rows and 100 columns of the user_artist data matrix as a heatmap. Write a single line of code that uses the rrm_data variable and the image function to plot the listening counts for each combination of row (user) and column (artist).

Hint: image(rrm_data[1:100,1:100])

Using a procedure similar to the one we used for the movie recommender system, write code that builds a model with the Recommender function of the recommenderlab library using the “UBCF” algorithm. Store the model in a variable named artist_model.

We’ll use the predict function to create a prediction for UserID 114 and store it in the variable artist_pre. Note that we ask for the top 10 predictions. The prediction can then be listed using the as method.

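train <- rrm_data
artist_model <- Recommender(train, method = "UBCF")
# A numeric index selects by row position in rrm_data
User <- 114
artist_pre <- predict(artist_model, rrm_data[User], n = 10)
# List the recommended artist IDs
as(artist_pre, "list")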
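For intuition about what a user-based collaborative filtering (“UBCF”) model does, here is a minimal sketch in plain Python. It is a simplified illustration, not recommenderlab’s implementation: it assumes cosine similarity computed over co-rated items and a similarity-weighted average over the k most similar users. The ratings dictionary, the function names, and the values of k and n are all hypothetical.

```python
import math

ratings = {  # hypothetical user -> {item: rating} data
    "u1": {"a": 5, "b": 3, "c": 4},
    "u2": {"a": 4, "b": 2, "c": 5, "d": 4},
    "u3": {"a": 1, "b": 5, "e": 5},
    "target": {"a": 5, "b": 2},
}


def cosine(r1, r2):
    """Cosine similarity over the items both users have rated."""
    common = set(r1) & set(r2)
    if not common:
        return 0.0
    dot = sum(r1[i] * r2[i] for i in common)
    n1 = math.sqrt(sum(r1[i] ** 2 for i in common))
    n2 = math.sqrt(sum(r2[i] ** 2 for i in common))
    return dot / (n1 * n2)


def recommend(ratings, user, k=2, n=2):
    """Return the top-n unseen items, scored by the k most similar users."""
    neighbours = sorted(
        ((cosine(ratings[user], r), other)
         for other, r in ratings.items() if other != user),
        reverse=True,
    )[:k]
    num, den = {}, {}
    for sim, other in neighbours:
        for item, rating in ratings[other].items():
            if item in ratings[user]:
                continue  # only score items the user has not rated yet
            num[item] = num.get(item, 0.0) + sim * rating
            den[item] = den.get(item, 0.0) + sim
    scores = {i: num[i] / den[i] for i in num if den[i] > 0}
    return sorted(scores, key=lambda i: (-scores[i], i))[:n]


top_items = recommend(ratings, "target")
print(top_items)
```

In this toy data the two nearest neighbours of "target" are u2 and u1, so only the items they rated (c and d) are candidates; item e, rated only by the dissimilar u3, is never scored.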


To work with more interesting datasets for recommender systems using recommenderlab or any other relevant library, refer to the article 9 Must-Have Datasets for Investigating Recommender Systems published on kdnuggets.com.



Saumya Soni
| December 15, 2020

Learn how to create a bird recognition app using Custom Vision AI and Power BI for application to track the effect of climate change on bird populations.

Imagine a world without birds: the ecosystem would fall apart, bug populations would skyrocket, erosion would be catastrophic, crops would be torn down by insects, and much other damage would follow. Did you know that 1,200 species are facing extinction over the next century, and many more are suffering from severe habitat loss?

Birds are fascinating and beautiful creatures that keep the ecosystem organized and balanced. They have emergent properties that help them react spontaneously in many situations, which sets them apart from other organisms.

Here are some fun facts: Parasitic jaegers (a species of bird) obtain food by stealing it directly from the beaks of other birds. The Bassian Thrush finds its food in a most unusual way: it has adapted its foraging method to release a large amount of gas that startles earthworms into moving, so the bird can find and eat them.

Inspired by the intriguing behaviors of birds, I created an app that can identify, in real time, any bird that captivates you. I also built this app to raise awareness of the heart-breaking reality that most birds face around the world.

Global trends of bird species survival chart

I first researched bird populations and their global trends using a data set covering the past 24 years. I then analyzed this data set and created interactive visuals using Power BI.

This chart displays the Red List Index (RLI) of species survival from 1988 to 2012. RLI values range from 1 (no species at risk of extinction in the near term) down to 0 (all species are extinct).
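As a rough illustration of how an index with these endpoints can be computed, the sketch below assumes the commonly cited “equal steps” weighting of Red List categories (Least Concern 0 up to Extinct 5) and the form RLI = 1 − (sum of category weights) / (5 × number of species). The weighting scheme, the function name, and the species counts are assumptions for illustration, not figures from the chart’s data set.

```python
# Assumed "equal steps" category weights: Least Concern (LC) = 0 ... Extinct (EX) = 5.
WEIGHTS = {"LC": 0, "NT": 1, "VU": 2, "EN": 3, "CR": 4, "EX": 5}


def red_list_index(counts):
    """Compute 1 - (total threat weight) / (max weight * number of species)."""
    total_species = sum(counts.values())
    if total_species == 0:
        raise ValueError("no species assessed")
    threat = sum(WEIGHTS[category] * n for category, n in counts.items())
    return 1 - threat / (WEIGHTS["EX"] * total_species)


# Hypothetical species counts per category.
all_safe = red_list_index({"LC": 100})      # no species at risk
all_extinct = red_list_index({"EX": 100})   # every species extinct
mixed = red_list_index({"LC": 70, "NT": 10, "VU": 10, "EN": 5, "CR": 3, "EX": 2})
print(all_safe, all_extinct, round(mixed, 3))
```

The two extreme cases reproduce the endpoints described above: 1 when every species is of least concern and 0 when every species is extinct.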

As you click on the Power BI line chart, you will notice that since 1988, bird species have faced a steadily increasing risk of extinction in every major region of the world (with change being more rapid in certain regions). 1 in 8 currently known bird species in the world is at the threshold of extinction. The main reasons are degradation or loss of habitat (due to deforestation, sea-level rise, more frequent wildfires, droughts, flooding, loss of snow and ice, and more), bird trafficking, pollution, and global warming. Most of these are the result of human activity.

Due to industrialization, more than 542,390,438 birds have lost their lives. Climate change is causing the natural food chain to fall apart. Birds starve with less food available (and must therefore fly longer distances), choke on human-made pollutants, and end up becoming weaker. Change is necessary, and with change comes compassion. This web app can help to build understanding of and empathy toward birds.

Let’s look at the Power BI reports and the web app:

Power BI report: Bird attributes / Bird Recognition

As you can see in this report, along with recognizing a specific bird in real-time, interactive visualizations from Power BI display the unique attributes and information about each bird and its status in the wild. The fun facts on the visualization about each bird will linger in your mind for days.

AI web app – To create a bird recognition app

In this web app, I used Cognitive Services to upload the images (of the 85 bird species), tag them, train the model, and evaluate the results. With Microsoft Custom Vision AI, I could train the model to recognize 85 bird species. You can upload an image from your file explorer, and the app will then predict the species name of the bird and the probability tied to that tag.

The Custom Vision Service uses machine learning to classify the images I uploaded. The only thing I was required to do was specify the correct tag for each image. You can also tag thousands of images at a time. The AI algorithm is immensely powerful as it gives us great accuracy and once the model is trained, we can use the same model to classify new images according to the needs of our app.

You can provide an image in three ways:

  1. Choose a bird image from your PC
  2. Upload a bird image URL
  3. Take a picture of a bird in real-time (only works on the phone app as described later in the blog)

Once you upload an image, it will call the Custom Vision Prediction API (which was already trained by Custom Vision, powered by Microsoft) to get the species of the bird.
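As a hedged sketch of what an app might do with the result of such a call, the Python snippet below picks the most likely species from a prediction response. It makes no network request. The payload shape (a "predictions" list whose entries carry "tagName" and "probability") is assumed to follow the JSON returned by the Custom Vision prediction endpoint, while the sample values, the function name, and the 0.5 threshold are hypothetical.

```python
def top_prediction(response, min_probability=0.5):
    """Return (tag name, probability) of the best prediction, or None if too uncertain."""
    predictions = response.get("predictions", [])
    if not predictions:
        return None
    best = max(predictions, key=lambda p: p["probability"])
    if best["probability"] < min_probability:
        return None  # hypothetical cut-off: treat the bird as unrecognized
    return best["tagName"], best["probability"]


sample_response = {  # hypothetical payload, for illustration only
    "predictions": [
        {"tagName": "Bassian Thrush", "probability": 0.91},
        {"tagName": "Parasitic Jaeger", "probability": 0.06},
    ]
}

result = top_prediction(sample_response)
print(result)
```

Returning None below a threshold is one possible design choice: it lets the app say "species not recognized" instead of showing a low-probability guess.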


Phone application:  

I also created a phone application, called ‘AI for Birds’, that you can use with camera integration for taking pictures of birds in real-time. After using the built-in camera to take a picture, the name of the bird species will be identified and shown. As of now, I added 85 bird species into the AI model, however that number will increase.

The journey of building my own custom model, training it, and deploying it has been noteworthy. My other blog explains how to build your own custom AI model, and you can follow its steps as a tutorial. Instructions for how to create Power BI reports and publish them to the web will also be provided in the other blog.


The grim statistics are not just sad news for bird populations. They are sad news for the planet, because the health of bird species is a key measure of the state of ecosystems and biodiversity on planet Earth in general.

I believe in: Exploring- Learning- Teaching- Sharing. There are several thousands of other bird species that are critical to biodiversity on planet earth.

Consider looking at my app and supporting organizations that work to fight the constant threats of habitat destruction and global warming today.

Our Earth is full of unique birds which took millions of years to evolve into the striking bird species we see today. We do not want to destroy organisms which took millions of years to evolve in just a couple of decades.


Rebecca Merrett
| November 22, 2017
Self-driving car ethics require proper study, training, and attention to detail. We must understand the ethical concerns of autonomous technology to minimize risk. 

New technology, new problems

When it comes to autonomous technology of any kind, the first things that often come to mind are our safety, our well-being, and our survival. It’s not ridiculous for us to have these concerns about the ethics of self-driving cars. First, we should ask the hard questions:

Who is responsible should a death result from an edge case accident?

What is an acceptable level of autonomy and what isn’t?

How does this technology come to a decision?

Second, with driverless cars – a prime example of autonomous technology – starting to be deployed on public roads across the world, we must seek answers to these questions sooner rather than later. The ethical dilemmas we face with driverless cars now will be similar to the ethical dilemmas we will face later on. Facing these issues head-on now could give us a head start on the many ethical issues we will need to face as technology becomes ever more advanced.

Confronting the self-driving ethical issues

Currently, MIT researchers are confronting the ethical dilemmas of driverless cars by playing out hypothetical scenarios. When an autonomous car must decide between the safety of its passengers and that of the people it encounters on the street, how does it choose the lesser of two evils? The viewer must judge which decision they would make if placed in a particularly intense scenario. This data is then compared with others’ responses and made publicly available.

In this way, researchers are gathering many people’s views on what is considered acceptable and unacceptable behavior for an autonomous car, and on what leads to the impossible choice of sacrificing one life over another’s. As alarming as it is, this research could help data scientists and engineers gain a better understanding of what actions might be taken should a far-fetched accident occur.

Of course, avoiding the far-fetched accident in the first place is a bigger priority. Still, the research is a step toward facing the issue head-on rather than believing that engineering alone is going to solve the problem.


Decision making alternatives

Some proposed ideas for minimizing the risk of self-driving car accidents include limiting the speed of autonomous cars to below the posted speed limit in certain densely populated areas and giving these cars a designated right of way.

More sophisticated mechanisms include using machine learning to continuously assess the risk of an accident and predict the probability of one occurring, so that action can be taken preemptively to avoid such a situation. The Center for Automotive Research at Stanford (a name suspiciously chosen for its acronym CARS, it seems) is looking into these ideas for “ethical programming.”

Putting in place ethical guidelines for all those involved in the build, implementation, and deployment of driverless cars is another step towards dealing with ethical dilemmas. For example, the Federal Ministry of Transport and Digital Infrastructure in Germany released ethical guidelines for driverless cars this year. The ministry plans to enforce these guidelines to help ensure driverless cars adhere to certain expectations of behavior.

For example, one guideline prohibits the classification of people based on their characteristics such as race and gender so that this does not influence decision-making should an accident occur.

Transparency in the design of driverless cars, and in how their algorithms come to a decision, also needs to be looked at as we work through the ethical dilemmas of driverless cars and other autonomous technology. This includes transparency toward consumers of these cars and the general public, who have a right to contribute to the algorithms and models that come to a decision.


Factors to consider

A child, for example, might be given a greater weight than a full-grown adult when a car decides who gets priority in safety and survival. Similarly, a pregnant woman might be given priority over a single man. Humans are the ones who will need to decide what kinds of weights are placed on what kinds of people, and research like MIT’s simulations of hypothetical scenarios is one way of letting the public openly engage in the design and development of these vehicles.

Where do we go from here?

As data scientists, we hold great responsibility when building models that directly impact people’s lives. Creating the algorithms, smarts, rules, and logic behind them is not too far off from the work of a doctor in an emergency who has to make critical decisions in a short amount of time.

Understanding the ethical concerns of autonomous technology, implementing ways to minimize risk, and then programming the hard decisions is by no means a trivial task. For this reason, self-driving car ethics require proper study, training, and attention to detail.

Nathan Piccini
| February 20, 2019

Raja Iqbal, Chief Data Scientist and CEO of Data Science Dojo, held a community talk on AI for Social Good. Let’s look at some key takeaways.

This discussion took place on January 30th in Austin, Texas. Below, you will find the event abstract and my key takeaways from the talk. I’ve also included the video at the bottom of the page.

Event abstract

“It’s not hard to see machine learning and artificial intelligence in nearly every app we use – from any website we visit, to any mobile device we carry, to any goods or services we use. Where there are commercial applications, data scientists are all over it. What we don’t typically see, however, is how AI could be used for social good to tackle real-world issues such as poverty, social and environmental sustainability, access to healthcare and basic needs, and more.

What if we pulled together a group of data scientists working on cutting-edge commercial apps and used their minds to solve some of the world’s most difficult social challenges? How much of a difference could one data scientist make let alone many?

In this discussion, Raja Iqbal, Chief Data Scientist and CEO of Data Science Dojo, will walk you through the different social applications of AI and how many real-world problems are begging to be solved by data scientists.  You will see how some organizations have made a start on tackling some of the biggest problems to date, the kinds of data and approaches they used, and the benefit these applications have had on thousands of people’s lives. You’ll learn where there’s untapped opportunity in using AI to make impactful change, sparking ideas for your next big project.”

1. We all have a social responsibility to build models that don’t hurt society or people

2. Data scientists don’t always work with commercial applications

  • Criminal Justice – Can we build a model that predicts if a person will commit a crime in the future?
  • Education – Machine Learning is being used to predict student churn at universities to identify potential dropouts and intervene before it happens.
  • Personalized Care – Better diagnosis with personalized health care plans

3. You don’t always realize if you’re creating more harm than good.

“You always ask yourself whether you could do something, but you never asked yourself whether you should do something.”

4. We are still figuring out how to protect society from all the data being gathered by corporations.

5. There has never been a better time for data analysis than today. APIs and SDKs are easy to use. IT services and data storage are significantly cheaper than 20 years ago, and costs keep decreasing.

6. Laws/Ethics are still being considered for AI and data use. Individuals, researchers, and lawmakers are still trying to work out the kinks. Here are a few situations with legal and ethical dilemmas to consider:

  • Granting parole using predictive models
  • Detecting disease
  • Military strikes
  • Availability of data implying consent
  • Self-driving car incidents

7. Issues can arise at each stage of data processing. Everyone has inherent bias in their thinking process, which affects the objectivity of data.

8. Modeler’s Hippocratic Oath

  • I will remember that I didn’t make the world and it doesn’t satisfy my equations.
  • Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.
  • I will never sacrifice reality for elegance without explaining why I have done so.
  • I will not give the people who use my model false comfort about accuracy. Instead, I will make explicit its assumptions and oversights.
  • I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.
  • I will aim to show how my analysis makes life better or more efficient.

Highlights of AI for social good
