Data Science at Home

By Francesco Gadaleta

Category: Podcasting

Description

Technology, machine learning and algorithms

Episode Date
Episode 43: Applied Text Analysis with Python (interview with Rebecca Bilbro)
00:36:32

Today’s episode is about text analysis with Python.
Python is the de facto standard in machine learning. It offers a large community and a generous choice of libraries, sometimes at the price of lower performance, but overall it is a decent language for typical data science tasks.

I am with Rebecca Bilbro, co-author of Applied Text Analysis with Python, with Benjamin Bengfort and Tony Ojeda.

We speak about the evolution of applied text analysis, tools and pipelines, and chatbots.

Aug 14, 2018
Episode 42: Attacking deep learning models (rebroadcast)
00:29:04

Attacking deep learning models

Compromising AI for fun and profit

Deep learning models have shown very promising results in computer vision and sound recognition. As more and more deep learning based systems get integrated in disparate domains, they will keep affecting people’s lives. Autonomous vehicles, medical imaging and banking applications, surveillance cameras and drones, and digital assistants are only a few real applications where deep learning plays a fundamental role. A malfunction in any of these applications will affect the quality of such integrated systems and compromise the security of the individuals who directly or indirectly use them.

In this episode, we explain how machine learning models can be attacked [...]

Aug 07, 2018
Episode 41: How can deep neural networks reason
00:18:03

Today’s episode will be about deep learning and reasoning. There has been a lot of discussion about the effectiveness of deep learning models and their capability to generalize, not only across domains but also to data that such models have never seen.

But there is a research group at the Department of Computer Science, Duke University, that seems to be onto something with deep learning and interpretability in computer vision.

References

Prediction Analysis Lab Duke University https://users.cs.duke.edu/~cynthia/lab.html [...]

Jul 31, 2018
Episode 40: Deep learning and image compression
00:17:20

Today’s episode will be about deep learning and data compression, in particular compressing images. We all know how important compressing data is: reducing the size of digital objects without affecting the quality.
As a very general rule, the more one compresses an image, the lower the quality, due to a number of factors like bitrate, quantization error, etcetera. I am glad to be here with Tong Chen, researcher at the School of Electronic Science and Engineering of Nanjing University, China.

Tong developed a deep learning based compression algorithm for images that seems to improve over state-of-the-art approaches like BPG, JPEG2000 and JPEG.

Reference

[...]

Jul 24, 2018
Episode 39: What is L1-norm and L2-norm?
00:21:55

In this episode I explain the differences between L1 and L2 regularization, which appear in the function minimization behind basically any machine learning model.
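
As a rough illustration of that difference (a minimal sketch, not taken from the episode), the snippet below computes both penalties for a small weight vector and applies one shrinkage step for each. The weight values, the strength `lam`, and all function names are hypothetical choices made for this example. The L1 step is the standard soft-thresholding operator; the L2 step is the closed-form shrinkage for a penalty of (lam / 2) times the squared L2 norm.

```python
def l1_norm(w):
    # Sum of absolute values of the weights.
    return sum(abs(x) for x in w)


def l2_norm_squared(w):
    # Sum of squared weights.
    return sum(x * x for x in w)


def l1_shrink(w, lam):
    # Soft-thresholding: move every weight towards zero by lam,
    # and set it to exactly zero if it would cross zero.
    out = []
    for x in w:
        magnitude = abs(x) - lam
        if magnitude <= 0:
            out.append(0.0)
        else:
            out.append(magnitude if x > 0 else -magnitude)
    return out


def l2_shrink(w, lam):
    # Shrinkage for the penalty (lam / 2) * ||w||^2:
    # every weight is scaled by the same factor 1 / (1 + lam).
    return [x / (1.0 + lam) for x in w]


# Hypothetical weights and regularization strength, for illustration only.
weights = [3.0, -0.5, 0.2, -2.0]
lam = 0.5

print("L1 penalty:", l1_norm(weights))
print("L2 penalty:", l2_norm_squared(weights))
print("after L1 step:", l1_shrink(weights, lam))
print("after L2 step:", l2_shrink(weights, lam))
```

In this sketch the L1 step sets the two small weights to exactly zero, while the L2 step only scales all weights down, which is the usual intuition for why L1 regularization tends to produce sparse models.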

Jul 19, 2018
Episode 38: Collective intelligence (Part 2)
00:46:36

In the second part of this episode I am interviewing Johannes Castner from CollectiWise, a platform for collective intelligence.
I am moving the conversation towards the more practical aspects of the project, asking about the centralised AGI and blockchain components that are essential parts of the platform.

References

  1. Opencog.org
  2. Thaler, Richard H., Sunstein, Cass R. and Balz, John P. (April 2, 2010). “Choice Architecture”. doi:10.2139/ssrn.1583509.

Jul 17, 2018
Episode 38: Collective intelligence (Part 1)
00:30:58

This is the first part of the amazing episode with Johannes Castner, CEO and founder of CollectiWise. Johannes is finishing his PhD in Sustainable Development from Columbia University in New York City, and he is building a platform for collective intelligence. Today we talk about artificial general intelligence and wisdom.

All references and shownotes will be published after the next episode.
Enjoy and stay tuned!

Jul 12, 2018
Episode 37: Predicting the weather with deep learning
00:26:25

Predicting the weather is one of the most challenging tasks in machine learning because physical phenomena are dynamic and rich in events. Moreover, most traditional approaches to climate forecasting are computationally prohibitive.
It seems that joint research between Earth System Science at the University of California, Irvine and the Faculty of Physics at LMU Munich offers an interesting improvement in the scalability and accuracy of climate predictive modeling. The solution is… superparameterization and deep learning.

References                  

Could Machine Learning Break the Convection Pa [...]

Jul 09, 2018
Episode 36: The dangers of machine learning and medicine
00:22:07

Humans seem to have reached a crossroads, where they are asked to choose between functionality and privacy. But not both. Not both at all. No data, no service. That’s what companies building personal finance services say. The same applies to marketing companies, social media companies, search engine companies, and healthcare institutions.

In this episode I speak about the reasons to aggregate data for precision medicine, the consequences of such strategies, and how researchers and organizations can provide services to individuals while respecting their privacy.

Jul 03, 2018
Episode 35: Attacking deep learning models
00:29:13

Attacking deep learning models

Compromising AI for fun and profit

Deep learning models have shown very promising results in computer vision and sound recognition. As more and more deep learning based systems get integrated in disparate domains, they will keep affecting people’s lives. Autonomous vehicles, medical imaging and banking applications, surveillance cameras and drones, and digital assistants are only a few real applications where deep learning plays a fundamental role. A malfunction in any of these applications will affect the quality of such integrated systems and compromise the security of the individuals who directly or indirectly use them.

In this episode, we explain how machine learning models can be attacked [...]

Jun 29, 2018
Episode 34: Get ready for AI winter
00:59:04

Today I am having a conversation with Filip Piękniewski, researcher working on computer vision and AI at Koh Young Research America.
His adventure with AI started in the 90s, and since then a long list of experiences at the intersection of computer science and physics has led him to the conclusion that deep learning might be neither sufficient nor appropriate to solve the problem of intelligence, specifically artificial intelligence.
I read some of his publications and got familiar with some of his ideas. Honestly, I have been attracted by the fact that Filip does not buy the hype around AI and deep learning in particular.
He doesn’t seem to share the [...]

Jun 22, 2018
Episode 33: Decentralized Machine Learning and the proof-of-train
00:17:40

In an attempt to democratize machine learning, data scientists should be able to train their models on data they do not necessarily own or see. A model that is privately trained should be verified and uniquely identified across its entire life cycle, from its random initialization to setting the optimal values of its parameters.
How does blockchain allow all this? Fitchain is the decentralized machine learning platform that provides models with an identity and a certification of their training procedure, the proof-of-train.

Jun 11, 2018
Episode 32: I am back. I have been building fitchain
00:23:14

I know, I have been away too long without publishing much in the last 3 months.
But, there’s a reason for that. I have been building a platform that combines machine learning with blockchain technology.
Let me introduce you to fitchain and tell you more in this episode.

If you want to collaborate on the project or just think it’s interesting, drop me a line on the contact page at fitchain.io

Jun 04, 2018
Founder Interview – Francesco Gadaleta of Fitchain
00:31:04

Cross-posting from Cryptoradio.io

Overview

Francesco Gadaleta introduces Fitchain, a decentralized machine learning platform that combines blockchain technology and AI to solve the data manipulation problem in restrictive environments such as healthcare or financial institutions. Francesco Gadaleta is the founder of Fitchain.io and senior advisor to Abe AI. Fitchain is a platform that officially started in October [...]

May 24, 2018
Episode 31: The End of Privacy
00:39:03

Data is a complex topic, not only related to machine learning algorithms, but also and especially to privacy and security of individuals, the same individuals who create such data just by using the many mobile apps and services that characterize their digital life.

In this episode I am together with B.J. Mendelson, author of “Social Media is Bullshit” from St. Martin’s Press and world-renowned speaker on the myths and realities of today’s Internet platforms. B.J. has a new book about privacy and sent me a free copy of “Privacy, and how to get it back”, which I read in just one day. That was enough to realise how much we have in common when it comes to data and data collection.

Apr 02, 2018
Episode 30: Neural networks and genetic evolution: an unfeasible approach
00:22:19

Despite what researchers claim about genetic evolution, in this episode we give a realistic view of the field.

Nov 21, 2017
Episode 29: Fail your AI company in 9 steps
00:14:27

In order to succeed with artificial intelligence, it is better to know how to fail first. It is easier than you think.
Here are 9 easy steps to fail your AI startup.

Nov 11, 2017
Episode 28: Towards Artificial General Intelligence: preliminary talk
00:20:34

The enthusiasm for artificial intelligence is raising some concerns, especially with respect to some bold conclusions about what AI can really do and what its direct descendant, artificial general intelligence, would be capable of doing in the immediate future. From stealing jobs to exterminating the entire human race, the creativity (of some) seems to have no limits.
In this episode I make sure that everyone comes back to reality - which might sound less exciting than Hollywood but definitely more… real.

Nov 04, 2017
Episode 27: Techstars accelerator and the culture of fireflies
00:17:42

In the aftermath of my experience at the Barclays Accelerator, powered by Techstars, one of the most innovative and influential startup accelerators in the world, I’d like to give back to the community the lessons I learned, including the need for confidence, soft skills, and efficiency, as they apply to startups that deal with artificial intelligence and data science.
In this episode I also share some thoughts about the culture of fireflies in modern and dynamic organisations.

Oct 30, 2017
Episode 26: Deep Learning and Alzheimer
00:54:02

In this episode I speak about deep learning applied to the prediction of Alzheimer’s disease. I had a great chat with Saman Sarraf, machine learning engineer at Konica Minolta, former lab manager at the Rotman Research Institute at Baycrest, University of Toronto, and author of DeepAD: Alzheimer’s Disease Classification via Deep Convolutional Neural Networks using MRI and fMRI.

I hope you enjoy the show.

Oct 23, 2017
Episode 25: How to become data scientist [RB]
00:16:16

In this episode, I speak about the requirements and the skills needed to become a data scientist and join an amazing community that is changing the world with data analytics.

Oct 16, 2017
Episode 24: How to handle imbalanced datasets
00:21:21

In machine learning and data science in general, it is very common to deal at some point with imbalanced datasets and class distributions. This is the typical case where the number of observations that belong to one class is significantly lower than the number belonging to the other classes. Actually this happens all the time, in several domains, from finance to healthcare to social media, just to name a few I have personally worked with.
Think about a bank detecting fraudulent transactions among millions or billions of daily operations, or equivalently in healthcare for the identification of rare disorders.
In genetics but also with clinical lab tests this is a normal scenario, in which, fortunately, there are very few patients affected by a disorder and therefore very [...]

Oct 08, 2017
Episode 23: Why do ensemble methods work?
00:18:59

Ensemble methods have been designed to improve the performance of a single model when that model is not very accurate. According to the general definition, ensembling consists of building a number of single classifiers and then combining or aggregating their predictions into one classifier that is usually stronger than any single one.

The key idea behind ensembling is that some models will do well when they model certain aspects of the data while others will do well in modelling other aspects.
In this episode I show with a numeric example why and when ensemble methods work.
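
One way to see the "why and when" with numbers (a minimal sketch, not necessarily the example used in the episode) is to compute the accuracy of a majority vote among several classifiers. The sketch assumes the classifiers make independent errors and all share the same individual accuracy `p`, which is a strong simplification; the function name and the values of `p` are hypothetical.

```python
from math import comb


def majority_vote_accuracy(n_models, p):
    # Probability that more than half of n_models independent classifiers,
    # each correct with probability p, give the right answer.
    needed = n_models // 2 + 1
    return sum(
        comb(n_models, k) * p**k * (1 - p) ** (n_models - k)
        for k in range(needed, n_models + 1)
    )


# Models that are individually better than chance (p = 0.7).
for n in (1, 3, 5, 11):
    print(n, "models at p=0.7:", round(majority_vote_accuracy(n, 0.7), 4))

# Models that are individually worse than chance (p = 0.4).
for n in (1, 3, 5, 11):
    print(n, "models at p=0.4:", round(majority_vote_accuracy(n, 0.4), 4))
```

Under these assumptions, three models at 70% accuracy vote their way to about 78%, and the figure keeps growing with more models. With models at 40% accuracy the vote gets worse instead, which illustrates the "when": the single models need to be better than chance and reasonably uncorrelated.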

Oct 03, 2017
Episode 22: Parallelising and distributing Deep Learning
00:19:42

Continuing the discussion of the last two episodes, there is one more aspect of deep learning that I would love to consider and therefore left as a full episode, that is parallelising and distributing deep learning on relatively large clusters.

As a matter of fact, computing architectures are changing in a way that is encouraging parallelism more than ever before. Deep learning is no exception: despite the great improvements brought by commodity GPUs (graphics processing units), there is still room for improvement when it comes to speed.

Together with the last two episodes, this one completes the picture of deep learning at scale. Indeed, as I mentioned in the previous episode, How to master optimisation in deep learning, the function op [...]

Sep 25, 2017
Episode 21: Additional optimisation strategies for deep learning
00:15:08

In the last episode, How to master optimisation in deep learning, I explained some of the most challenging tasks of deep learning and some methodologies and algorithms to improve the speed of convergence of a minimisation method for deep learning.
I explored the family of gradient descent methods - even though not exhaustively - giving a list of approaches that deep learning researchers are considering for different scenarios. Every method has its own benefits and drawbacks, pretty much depending on the type of data, and data sparsity. But there is one method that seems to be, at least empirically, the best approach so far.

Feel free to listen to the previous episode,

Sep 18, 2017
Episode 20: How to master optimisation in deep learning
00:19:29

The secret behind deep learning is not really a secret. It is function optimisation. What a neural network essentially does is optimise a function. In this episode I illustrate a number of optimisation methods and explain which one is the best and why.
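
To make "optimising a function" concrete, here is a minimal sketch of plain gradient descent on a one-dimensional toy function. It is only the baseline idea, not the method the episode singles out as best, and the function f(x) = (x - 3)^2, the starting point, the learning rate, and the number of steps are all hypothetical choices for illustration.

```python
def gradient_descent(grad, x0, learning_rate=0.1, steps=100):
    # Repeatedly step against the gradient to reduce the function value.
    x = x0
    for _ in range(steps):
        x = x - learning_rate * grad(x)
    return x


def grad_f(x):
    # Derivative of the toy function f(x) = (x - 3)^2, whose minimum is at x = 3.
    return 2.0 * (x - 3.0)


x_min = gradient_descent(grad_f, x0=0.0)
print("estimated minimiser:", x_min)
```

A neural network does the same thing in a much higher-dimensional space: the variable is the full set of weights and the function is the training loss.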

Aug 28, 2017
Episode 19: How to completely change your data analytics strategy with deep learning
00:15:56

Over the past few years, neural networks have re-emerged as powerful machine-learning models, reaching state-of-the-art results in several fields like image recognition and speech processing. More recently, neural network models have also been applied to textual data in order to deal with natural language, and there too with promising results. In this episode I explain why deep learning performs the way it does, and what some of the most tedious causes of failure are.

Aug 09, 2017
Episode 18: Machines that learn like humans
00:42:06

Artificial intelligence allows machines to learn patterns from data. The way humans learn, however, is different and more efficient. With Lifelong Machine Learning, machines can learn the way human beings do, faster and more efficiently.

Mar 28, 2017
Episode 17: Protecting privacy and confidentiality in data and communications
00:17:31

Talking about security of communication and privacy is never enough, especially when political instabilities are driving leaders towards decisions that will affect people on a global scale.

Feb 15, 2017
Episode 16: 2017 Predictions in Data Science
00:20:31

We strongly believe 2017 will be a very interesting year for data science and artificial intelligence. Let me tell you what I expect and why.

Dec 23, 2016
Episode 15: Statistical analysis of phenomena that smell like chaos
00:10:14

Is the market really predictable? How do stock prices increase? What are their dynamics? Here is what I think about the magic and the reality of predictions applied to markets and the stock exchange.

Dec 05, 2016
Episode 14: The minimum required by a data scientist
00:16:46

Why the job of the data scientist can disappear soon. What is required by a data scientist to survive inflation.

Sep 27, 2016
Episode 13: Data Science and Fraud Detection at iZettle
00:16:32

Data science is also making a difference in fraud detection. In this episode I have a conversation with an expert in the field, engineer Eyad Sibai, who works at iZettle, a fraud detection company.

Sep 06, 2016
Episode 12: EU Regulations and the rise of Data Hijackers
00:16:17

Extracting knowledge from large datasets with a large number of variables is always tricky. Dimensionality reduction helps in analyzing high-dimensional data while still maintaining most of the information hidden behind complexity. Here are some methods that you must try before further analysis (Part 1).

Jul 26, 2016
Episode 11: Representative Subsets For Big Data Learning
00:21:25

How would you perform accurate classification on a very large dataset by just looking at a sample of it?

May 03, 2016