Search results “Data mining ensemble classifiers gold”
17. Learning: Boosting
 
51:40
MIT 6.034 Artificial Intelligence, Fall 2010 View the complete course: http://ocw.mit.edu/6-034F10 Instructor: Patrick Winston Can multiple weak classifiers be used to make a strong one? We examine the boosting algorithm, which adjusts the weight of each classifier, and work through the math. We end with how boosting doesn't seem to overfit, and mention some applications. License: Creative Commons BY-NC-SA More information at http://ocw.mit.edu/terms More courses at http://ocw.mit.edu
Views: 149990 MIT OpenCourseWare
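The reweighting step that the boosting lecture above refers to can be sketched in a few lines. This is a minimal illustration of one round of the standard AdaBoost update, not the lecture's own code; the function name `adaboost_round` and its argument layout are assumptions made for this example.

```python
import math

def adaboost_round(weights, correct):
    """One round of the standard AdaBoost update (illustrative sketch).

    weights: current sample weights, assumed to sum to 1.
    correct: list of bools, True where the weak classifier was right.
    Returns (alpha, new_weights).
    """
    # Weighted error of the weak classifier under the current weights.
    error = sum(w for w, ok in zip(weights, correct) if not ok)
    if not 0.0 < error < 0.5:
        raise ValueError("weak classifier needs a weighted error in (0, 0.5)")
    # Vote of this classifier in the final ensemble.
    alpha = 0.5 * math.log((1.0 - error) / error)
    # Shrink the weights of correct samples, grow the misclassified ones.
    scaled = [w * math.exp(-alpha if ok else alpha)
              for w, ok in zip(weights, correct)]
    total = sum(scaled)
    # Renormalise so the weights form a distribution again.
    return alpha, [w / total for w in scaled]
```

With four equally weighted samples and one mistake, the misclassified sample ends up carrying half of the total weight, which is a known property of this update.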
Marios Michailidis: How to become a Kaggle #1: An introduction to model stacking
 
54:25
Ever wondered how Kaggle masters combine hundreds of different machine learning models to win modelling competitions? Ever wondered how to become ranked #1 on Kaggle? StackNet has helped me do that! StackNet is an open-source, scalable and automated meta-modelling framework that combines various supervised models to improve performance. Written in Java, this library automates many of the laborious aspects of building stacking models, so that you can focus on the important parts and move higher up the Kaggle leaderboards. I will explain some of the considerations for running StackNet and show how I have used it to win Kaggle competitions and generate value for dunnhumby.
Views: 16590 Data Science Festival
Microsoft Excel Data Mining: Classification
 
06:49
Microsoft Excel Data Mining: Classification. For more visit here: www.dataminingtools.net
Excel at Data Mining – Creating and Reading a Classification Matrix
 
05:30
In this video, Billy Decker of StatSlice Systems shows you how to create and read a Classification Matrix in 5 minutes with the Microsoft Excel data mining add-in*. In this example, we will create a Classification Matrix based on a mining structure with all of its associated models that we have created previously. For the example, we will be using a tutorial spreadsheet that can be found on Codeplex at: https://dataminingaddins.codeplex.com/releases/view/87029 You will also need to attach the AdventureworksDW2012 data file to SQL Server which can be downloaded here: http://msftdbprodsamples.codeplex.com/releases/view/55330 *This tutorial assumes that you have already installed the data mining add-in for Excel and configured the add-in to be pointed at an instance of SQL Server with Analysis Services to which you have access rights.
Views: 3622 StatSlice Systems
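For readers without the Excel add-in, the idea behind a classification matrix can be sketched directly. This is a minimal illustration, not the add-in's implementation; the function name `classification_matrix` and the layout (rows are actual classes, columns are predicted classes) are assumptions made for this example.

```python
from collections import Counter

def classification_matrix(actual, predicted):
    """Count (actual, predicted) pairs into a square matrix.

    Returns (labels, matrix), where matrix[i][j] is the number of samples
    whose actual class is labels[i] and predicted class is labels[j].
    """
    counts = Counter(zip(actual, predicted))
    labels = sorted(set(actual) | set(predicted))
    # Counter returns 0 for pairs that never occurred.
    matrix = [[counts[(a, p)] for p in labels] for a in labels]
    return labels, matrix
```

The diagonal holds the correctly classified counts; everything off the diagonal is a misclassification.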
Creating Data Mining Structures & Predictive Models using the Excel Add-In for SQL Server 2008
 
12:12
A demonstration of how to create Data Mining Structures & Predictive Models using the Excel Data mining Addin for SQL Server 2008. A data mining structure is created first and then a Microsoft Decision Tree & Neural Network are created. In the subsequent video I will create a lift chart (also known as an Accuracy Chart) to compare the effectiveness of the two models. The raw data used in the demonstration is available at http://www.analyticsinaction.com/creating-data-mining-structures-predictive-models-using-the-excel-add-in-for-sql-server-2008/ I also have a comprehensive 60 minute T-SQL course available at Udemy : https://www.udemy.com/t-sql-for-data-analysts/?couponCode=ANALYTICS50%25OFF
Views: 25817 Steve Fox
S2E2 of 5 Minutes with Ingo: Ensemble Methods
 
06:32
This week, Ingo covers Ensemble Methods. Ingo discusses the Wisdom of the Crowds philosophy that the opinions of many are more intelligent than the opinion of an individual to explain ensemble methods. He details how we can use the error correcting output methods of bagging and boosting to get better results.
Views: 659 RapidMiner, Inc.
Machine Learning for Encrypted Malware Traffic Classification
 
02:36
Machine Learning for Encrypted Malware Traffic Classification: Accounting for Noisy Labels and Non-Stationarity Blake Anderson (Cisco Systems, Inc.) David McGrew (Cisco Systems, Inc.) The application of machine learning for the detection of malicious network traffic has been well researched over the past several decades; it is particularly appealing when the traffic is encrypted because traditional pattern-matching approaches cannot be used. Unfortunately, the promise of machine learning has been slow to materialize in the network security domain. In this paper, we highlight two primary reasons why this is the case: inaccurate ground truth and a highly non-stationary data distribution. To demonstrate and understand the effect that these pitfalls have on popular machine learning algorithms, we design and carry out experiments that show how six common algorithms perform when confronted with real network data. With our experimental results, we identify the situations in which certain classes of algorithms underperform on the task of encrypted malware traffic classification. We offer concrete recommendations for practitioners given the real-world constraints outlined. From an algorithmic perspective, we find that the random forest ensemble method outperformed competing methods. More importantly, feature engineering was decisive; we found that iterating on the initial feature set, and including features suggested by domain experts, had a much greater impact on the performance of the classification system. For example, linear regression using the more expressive feature set easily outperformed the random forest method using a standard network traffic representation on all criteria considered. Our analysis is based on millions of TLS encrypted sessions collected over 12 months from a commercial malware sandbox and two geographically distinct, large enterprise networks. More on http://www.kdd.org/kdd2017/
Views: 937 KDD2017 video
Build a TensorFlow Image Classifier in 5 Min
 
05:47
In this episode we're going to train our own image classifier to detect Darth Vader images. The code for this repository is here: https://github.com/llSourcell/tensorflow_image_classifier I created a Slack channel for us, sign up here: https://wizards.herokuapp.com/ The Challenge: The challenge for this episode is to create your own Image Classifier that would be a useful tool for scientists. Just post a clone of this repo that includes your retrained Inception Model (label it output_graph.pb). If it's too big for GitHub, just upload it to DropBox and post the link in your GitHub README. I'm going to judge all of them and the winner gets a shoutout from me in a future video, as well as a signed copy of my book 'Decentralized Applications'. This CodeLab by Google is super useful in learning this stuff: https://codelabs.developers.google.com/codelabs/tensorflow-for-poets/?utm_campaign=chrome_series_machinelearning_063016&utm_source=gdev&utm_medium=yt-desc#0 This Tutorial by Google is also very useful: https://www.tensorflow.org/versions/r0.9/how_tos/image_retraining/index.html This is a good informational video: https://www.youtube.com/watch?v=VpDonQAKtE4 Really deep dive video on CNNs: https://www.youtube.com/watch?v=FmpDIaiMIeA I love you guys! Thanks for watching my videos and if you've found any of them useful I'd love your support on Patreon: https://www.patreon.com/user?u=3191693 Much more to come so please SUBSCRIBE, LIKE, and COMMENT! :) edit: Credit to Clarifai for the first conv net diagram in the video Follow me: Twitter: https://twitter.com/sirajraval Facebook: https://www.facebook.com/sirajology Instagram: https://www.instagram.com/sirajraval/
Views: 540470 Siraj Raval
Coursera: how to win a data science competition – Dmitry Ulyanov
 
17:37
Dmitry Ulyanov talks about the Coursera course How to Win a Data Science Competition, in which he and other well-known, successful data scientists teach the specifics of solving data analysis competitions. Current competitions are listed at http://mltrainings.ru/ News about upcoming training sessions and videos is posted in these groups: VKontakte https://vk.com/mltrainings Facebook https://www.facebook.com/groups/1413405125598651/
RapidMiner 5 Tutorial - Video 10 - Feature Selection
 
03:23
Vancouver Data Blog http://vancouverdata.blogspot.com/
Views: 15894 el chief
How to Successfully Harness AI to Combat Fraud and Abuse - RSA 2018
 
34:18
Slides and blog posts available at https://elie.net/ai This talk explains why artificial intelligence (AI) is the key to building anti-abuse defenses that keep up with user expectations and combat increasingly sophisticated attacks. It covers the top 10 anti-abuse specific challenges encountered while applying AI to abuse fighting, and how to overcome them. This video is a re-recording of the talk I gave at RSA 2018 on the subject
Views: 4736 Elie Bursztein
Kaggle Cdiscount’s Image Classification Challenge — Pavel Ostyakov, Alexey Kharlamov
 
46:05
Pavel Ostyakov and Alexey Kharlamov share their solution of Kaggle Cdiscount’s Image Classification Challenge, in which they and their team took 5th place. In this competition, Kagglers were challenged to build a model that classifies the products based on their images. From this video you will learn: - How to decide which architectures to use - How to train networks faster - Problems with training a second layer of classifiers, and how to solve them - Errors made while solving the problem - Ideas of other teams: using several images of a product, ensembling and using kNN Yandex hosts biweekly training sessions on machine learning. These meetings offer an opportunity for the participants of data analysis contests to meet, talk, and exchange experience. Each of these events is made up of a practical session and a report. The problems are taken from Kaggle and similar platforms. The reports are given by successful participants of recent contests, who share their strategies and talk about the techniques used by their competitors. On Dec. 9, we looked at Porto Seguro’s Safe Driver Prediction challenge on Kaggle.
Libor Mořkovský - Recognizing Malware (Machine Learning Prague 2016)
 
24:27
Recognizing Malware www.mlprague.com Slides: http://www.slideshare.net/mlprague/libor-mokovsk-recognizing-malware
Views: 170 Jiří Materna
Kaggle Competition "Home Depot Product Search Relevance"
 
29:04
Contributed by Amy(Yujing) Ma, Brett Amdur, Christopher Redino. They enrolled in the NYC Data Science Academy 12 week full time Data Science Bootcamp program taking place between January 11th to April 1st, 2016. This post is based on their machine learning project (due on the 8th week of the program). Kaggle Competition "Home Depot Product Search Relevance": https://www.kaggle.com/c/home-depot-product-search-relevance Given only raw text as input, our goal is to predict the relevancy of products to search results at the Home Depot website. Our strategy is a little different from most other teams in this Kaggle competition, where we generated a workflow that starts with text cleaning, passes through feature engineering and ends with model selection and parameter tuning in the attempt to stand out among thousands of competitors. Learn more: http://blog.nycdatascience.com/student-works/improving-home-depot-search-relevance/
Charles Martin: Can Machine Learning Apply to Musical Ensembles?
 
01:04
Part of the CHI 2016 Human Centred Machine Learning Workshop. The full paper is here: Can Machine-Learning Apply to Musical Ensembles? Charles Martin and Henry Gardner http://www.doc.gold.ac.uk/~mas02mg/HCML2016/HCML2016_paper_5.pdf
Views: 45 Marco Gillies
Kaggle Carvana Image Masking: identifying the background in car images — Sergey Mushinskiy
 
23:17
Sergey Mushinskiy talks about the task of identifying the background in car images (Kaggle Carvana Image Masking Challenge). Sergey and his team took 4th place in the competition. From this video you will learn: - Using neural networks to pseudo-label images - Working as a team with shared, limited computing resources - Whether it is ever useful to label additional objects by hand - Approaches of other participants: from two networks without averaging to complex ensembles with diverse architectures Slides: https://gh.mltrainings.ru/presentations/Mushinskiy_KaggleCarvanaImageMasking%20Challenge_2017.pdf Current competitions are listed at http://mltrainings.ru/ News about upcoming training sessions and videos is posted in these groups: VKontakte https://vk.com/mltrainings Facebook https://www.facebook.com/groups/1413405125598651/
Tech Talk: Teach JS Aesthetics with Machine Learning
 
51:24
Jonathan Martin will give you a whirlwind tour of the fundamental concepts and algorithms in machine learning, then explore a front-end application: selecting the "best" photos to feature on our photo sharing site. Don't expect mathematically laborious derivations of SVM kernels or the infinite VC dimension of Neural Nets, but we will gain enough intuition to make informed compromises (thanks to the No Free Lunch theorem, everything is a compromise) in our pursuit of aesthetically-intelligent machines. Find Jonathan on Twitter: @nybblr http://www.bignerdranch.com http://www.twitter.com/bignerdranch http://www.facebook.com/bignerdranch
Views: 83 Big Nerd Ranch
TTIC Distinguished Lecture Series - Geoffrey Hinton
 
01:08:08
Title: Dark Knowledge Abstract: A simple way to improve classification performance is to average the predictions of a large ensemble of different classifiers. This is great for winning competitions but requires too much computation at test time for practical applications such as speech recognition. In a widely ignored paper in 2006, Caruana and his collaborators showed that the knowledge in the ensemble could be transferred to a single, efficient model by training the single model to mimic the log probabilities of the ensemble average. This technique works because most of the knowledge in the learned ensemble is in the relative probabilities of extremely improbable wrong answers. For example, the ensemble may give a BMW a probability of one in a billion of being a garbage truck but this is still far greater (in the log domain) than its probability of being a carrot. This "dark knowledge", which is practically invisible in the class probabilities, defines a similarity metric over the classes that makes it much easier to learn a good classifier. I will describe a new variation of this technique called "distillation" and will show some surprising examples in which good classifiers over all of the classes can be learned from data in which some of the classes are entirely absent, provided the targets come from an ensemble that has been trained on all of the classes. I will also show how this technique can be used to improve a state-of-the-art acoustic model and will discuss its application to learning large sets of specialist models without overfitting. This is joint work with Oriol Vinyals and Jeff Dean. Bio: Geoffrey Hinton received his BA in experimental psychology from Cambridge in 1970 and his PhD in Artificial Intelligence from Edinburgh in 1978. He did postdoctoral work at Sussex University and the University of California San Diego and spent five years as a faculty member in the Computer Science department at Carnegie-Mellon University. 
He then became a fellow of the Canadian Institute for Advanced Research and moved to the Department of Computer Science at the University of Toronto. He spent three years from 1998 until 2001 setting up the Gatsby Computational Neuroscience Unit at University College London and then returned to the University of Toronto where he is a University Professor. He is the director of the program on "Neural Computation and Adaptive Perception" which is funded by the Canadian Institute for Advanced Research. Geoffrey Hinton is a fellow of the Royal Society, the Royal Society of Canada, and the Association for the Advancement of Artificial Intelligence. He is an honorary foreign member of the American Academy of Arts and Sciences, and a former president of the Cognitive Science Society. He has received honorary doctorates from the University of Edinburgh and the University of Sussex. He was awarded the first David E. Rumelhart prize (2001), the IJCAI award for research excellence (2005), the IEEE Neural Network Pioneer award (1998), the ITAC/NSERC award for contributions to information technology (1992) the Killam prize for Engineering (2012) and the NSERC Herzberg Gold Medal (2010) which is Canada's top award in Science and Engineering. Geoffrey Hinton designs machine learning algorithms. His aim is to discover a learning procedure that is efficient at finding complex structure in large, high-dimensional datasets and to show that this is how the brain learns to see. He was one of the researchers who introduced the back-propagation algorithm that has been widely used for practical applications. His other contributions to neural network research include Boltzmann machines, distributed representations, time-delay neural nets, mixtures of experts, variational learning, products of experts and deep belief nets. His current main interest is in unsupervised learning procedures for multi-layer neural networks with rich sensory input.
Views: 24715 TTIC
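The "dark knowledge" idea in the abstract above can be illustrated with a small numeric sketch. The temperature parameter is not mentioned in the abstract; it is borrowed here from the published distillation formulation, and the function names `softmax` and `soft_target_loss` are assumptions made for this example rather than the speaker's code.

```python
import math

def softmax(logits, temperature=1.0):
    """Softmax over logits divided by a temperature (higher = softer)."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def soft_target_loss(student_logits, teacher_logits, temperature):
    """Cross-entropy between softened teacher and student distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q))
```

Raising the temperature makes the small probabilities of wrong classes visible in the teacher's output, so a student trained on this loss is pushed to reproduce their relative sizes as well as the top prediction.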
Kaggle Camera Model Identification (1-2 places) — Artur Fattakhov, Ilya Kibardin, Dmitriy Abulkhanov
 
37:29
Artur Fattakhov, Ilya Kibardin and Dmitriy Abulkhanov share their winning solutions of Kaggle Camera Model Identification. In this competition, Kagglers were challenged to build an algorithm that identifies which camera model captured an image by using traces intrinsically left in the image. From this video you will learn: - How to get additional photo data - Training scheme with cyclic learning rate and pseudo labeling - Snapshot Ensembles aka Multi Checkpoint TTA - Training on small crops and fine-tuning on big crops to speed up without loss in quality - Prediction equalization Slides: https://gh.mltrainings.ru/presentations/KibardinFattahovAbulkhanov_KaggleCamera_2018.pdf Github: https://github.com/ikibardin/kaggle-camera-model-identification Yandex hosts biweekly training sessions on machine learning. These meetings offer an opportunity for the participants of data analysis contests to meet, talk, and exchange experience. Each of these events is made up of a practical session and a report. The problems are taken from Kaggle and similar platforms. The reports are given by successful participants of recent contests, who share their strategies and talk about the techniques used by their competitors.
A Closer Look at KNN Solutions
 
02:00
This video is part of the Udacity course "Machine Learning for Trading". Watch the full course at https://www.udacity.com/course/ud501
Views: 1951 Udacity
9 RapidMiner - Ensemble (Majority Vote)
 
21:45
An ensemble method is a technique in which several learning algorithms work together to achieve higher predictive performance. This clip explains one of the many ensemble methods, majority vote, using RapidMiner.
Views: 210 Kanda
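The majority-vote ensemble described in the clip above can be sketched outside RapidMiner as well. This is a minimal illustration under stated assumptions: the function name `majority_vote` and the input layout (one list of predicted labels per model) are hypothetical and do not come from the video.

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model predictions by taking the most common label.

    predictions: list of lists, one inner list of labels per model,
    all of the same length (one label per sample).
    Ties go to the label that appears first among the tied models.
    """
    voted = []
    # zip(*predictions) regroups the labels sample by sample.
    for sample_preds in zip(*predictions):
        voted.append(Counter(sample_preds).most_common(1)[0][0])
    return voted
```

Using an odd number of models for a two-class problem avoids ties altogether.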
8.1: Introduction to Kaggle, Deep Learning and TensorFlow (Module 8, Part 1)
 
15:09
Introduction to Kaggle with neural networks. Kaggle has a variety of datasets that can be useful for machine learning research. This video is part of a course that is taught in a hybrid format at Washington University in St. Louis; however, all the information is online and you can easily follow along. T81-558: Application of Deep Learning, at Washington University in St. Louis Please subscribe and comment! Follow me: YouTube: https://www.youtube.com/user/HeatonResearch Twitter: https://twitter.com/jeffheaton GitHub: https://github.com/jeffheaton More links: Complete course: https://sites.wustl.edu/jeffheaton/t81-558/ Complete playlist: https://www.youtube.com/playlist?list=PLjy4p-07OYzulelvJ5KVaT2pDlxivl_BN
Views: 820 Jeff Heaton
Ian Ozsvald, Dr Gusztav Belteki & Giles Weaver - Machine learning with ventilator data
 
41:35
Filmed at PyData London 2017 Description Mechanical ventilators are widely used in intensive care. They are sophisticated, but doctors do not have time to analyse the copious traces of data in a neonatal unit. We are providing an easy-to-interpret summary of this time-series data using visualisation and machine learning. This is an open source collaboration with the NHS; all results are open. Abstract Mechanical ventilators are widely used in intensive care. Even two decades ago they were primarily mechanical devices whose "only" task was to inflate the patient’s lung. Recently, however, they have become equipped with powerful computers that provide sophisticated ventilator modes. Data provided by the ventilators are almost never downloaded, stored or analysed. The data is complex, high frequency and requires time-intensive scrutiny to review. Doctors do not have time to analyse these traces in a neonatal unit. We are providing a simple and easy-to-interpret summary of 100Hz dual-channel ventilator data to improve the quality of care of young infants by time-poor staff. This involves signal processing, visualisation, building a gold standard and machine learning to segment breaths and summarise a baby's behaviour. This builds on our talk at PyDataLondon Meetup 30 in January 2017. Our goal is to open source the research so that others can benefit from the processes that we develop. We invite feedback from the audience to help improve our methods. Anyone interested in time series data, automated labeling, scikit-learn, Bokeh and medical applications will find this talk of interest. Both the highs and lows of our current approaches will be discussed. This is a collaboration between Dr Gusztav Belteki (Cambridge University Hospitals NHS Foundation Trust), Ian Ozsvald (ModelInsight) and Giles Weaver (ModelInsight). www.pydata.org PyData is an educational program of NumFOCUS, a 501(c)3 non-profit organization in the United States. 
PyData provides a forum for the international community of users and developers of data analysis tools to share ideas and learn from each other. The global PyData network promotes discussion of best practices, new approaches, and emerging technologies for data management, processing, analytics, and visualization. PyData communities approach data science using many languages, including (but not limited to) Python, Julia, and R. We aim to be an accessible, community-driven conference, with novice to advanced level presentations. PyData tutorials and talks bring attendees the latest project features along with cutting-edge use cases.
Views: 1165 PyData
The Spider's Web: Britain's Second Empire (Documentary)
 
01:18:02
At the demise of empire, City of London financial interests created a web of secrecy jurisdictions that captured wealth from across the globe and hid it in a web of offshore islands. Today, up to half of global offshore wealth is hidden in British jurisdictions and Britain and its dependencies are the largest global players in the world of international finance. Sponsor the next film on Patreon: https://www.patreon.com/independentdocumentary Share this documentary with your friends, and ask sites to feature it: https://twitter.com/spiderswebfilm https://www.facebook.com/Spiderswebfilm/ https://www.imdb.com/title/tt6483026/ Translate this documentary here on youtube or contact us for the .srt file [email protected] For those interested to learn more about tax justice and financial secrecy, read about the Tax Justice Network's campaigning and regular blogs - become part of the movement for change and listen to the Tax Justice Network's monthly podcast/radio show the Taxcast https://www.taxjustice.net/taxcast/ Review on Open Democracy: https://www.opendemocracy.net/neweconomics/film-review-spiders-web-britains-second-empire/ Review on Filmotomy: https://filmotomy.com/the-spiders-web-britains-second-empire/ Website: www.spiderswebfilm.com Spanish Version: https://www.youtube.com/watch?v=85dsTnbhchc French Version: https://www.youtube.com/watch?v=hizj_6EH34M Italian Version: https://www.youtube.com/watch?v=VwmvXLamkto&t=1s Subtitles: French, Spanish, German, Italian, Russian, Arabic, Korean, Hungarian, English, Turkish.
Views: 368930 Independent POV
HOW TO BREAK A CAPTCHA | INTRODUCTION TO AI
 
26:14
GOVANIFY'S LAB #2 AI is always viewed as a huge black box, as something even creators don't understand and you are always shown the neuronal network part of the story, but what about showing to you the complete story about AI? As always in this series of video, through a silly project, I'm explaining to you how AI works and how state-of-the-art object detection works. I am also giving you resources to make you able to research on your own and learn more about the subject, by trying to make this as useful in passive and active learning once again! I've tried to improve as best as I could both the audio quality and video quality, even though the editing is still sort of lazy imho. I take all sort of criticisms and help requests about this, so feel free to contact me! I sincerely hope you liked this video! -G ~~~~~ Resources: TensorFlow: http://tensorflow.org/ OpenAI: https://openai.com Twitter: https://twitter.com/GovanifY Blog: https://govanify.com Mail: [email protected] (PGP available on the "About" page on my website) Video made using KdenLive and Audacity on Gentoo Linux Microphone: AT2035 XLR Audio Interface: Focusrite Scarlett 2i2 Camera: The Frankenstein phone(aka what happens when police breaks your phone and you repair the screen as you can, OnePlus 3t) Sources: Background Music: Hopeless Romantic, FearofDark 2-Minute Neuroscience - Medulla Oblongata, Neuroscientifically Challenged 3D Animation - Brain with Neurons Firing, TomsTwinFilms 4K Mask RCNN COCO Object detection and segmentation #2, Karol Majek Baxter the Robot playing Tic-tac-toe Game 4, Michael Overstreet But what _is_ a Neural Network _ Chapter 1, deep learning, 3Blue1Brown Cleverbot vs Humans, Humor Vault FUNNIEST CAPTCHA FAILS, Random GO 2015 SPL Finals - Nao-Team HTWK vs. 
B-Human 1st half, HTWK Robots Google's DeepMind AI Just Taught Itself To Walk, Tech Insider Hot Robot At SXSW Says She Wants To Destroy Humans _ The Pulse _ CNBC, CNBC How to use Tensorboard Embeddings Visualization with MNIST, Anuj shah MariFlow gets Gold in 50cc Mushroom Cup, SethBling Mind reading with brain scanners _ John-Dylan Haynes _ TEDxBerlin, TEDx Talks RASPUTIN - Vladimir Putin - Love The Way You Move (Funk Overload) @slocband, Pace Audioo Recurrent Neural Network Visualization, Michael Stone Walking around Shibuya - Tokyo - 渋谷を歩く- 4K Ultra HD, TokyoStreetView - Japan The Beautiful Tensorflow 14 Visualization Tensorboard 1 (neural network tutorials), 周莫烦 US Military Robot Dog will make a great companion for RoboCop, ArmedForcesUpdate Visualization of newly formed synapses with unprecedented resolution, Max Planck Florida Institute for Neuroscience I don't think I've used much of other sources, if you think you should be credited or do not like to be on this video please contact me Random thanks in no particular order: : Ely (sykhro), Frafnir, Batiste Flo., Xaddgx, Vlad
Views: 536 Gravof Corp
Boston BSides - Machine Learning for Incident Detection - Chris McCubbin & David Bianco
 
43:06
Organizations today are collecting more information about what's going on in their environments than ever before, but manually sifting through all this data to find evil on your network is next to impossible. Increasingly, companies are turning to big data analytics and machine learning to detect security incidents. Most of these solutions are black-box products that cannot be easily tailored to the environments in which they run. Therefore, reliable detection of security incidents remains elusive, and there is a distinct lack of open source innovation. It doesn't have to be this way! Many security pros think nothing of whipping up a script to extract downloaded files from a PCAP, yet recoil in horror at the idea of writing their own machine learning tools. The "analytics barrier" is perceived to be very high, but getting started is much easier than you think! In this presentation, we’ll walk through the creation of a simple Python script that can learn to find malicious activity in your HTTP proxy logs. At the end of it all, you'll not only gain a useful tool to help you identify things that your IDS and SIEM might have missed, but you’ll also have the knowledge necessary to adapt that code to other uses as well. David J. Bianco is a Security Technologist at Sqrrl Data, Inc. Before coming to work as a Security Technologist and DFIR subject matter expert at Sqrrl, he led the hunt team at Mandiant, helping to develop and prototype innovative approaches to detect and respond to network attacks. Prior to that, he spent five years helping to build an intel-driven detection & response program for General Electric (GE-CIRT). He set detection strategies for a network of nearly 500 NSM sensors in over 160 countries and led response efforts for some of the company’s most critical incidents. He stays active in the community, speaking and writing on the subjects of Incident Detection & Response, Threat Intelligence and Security Analytics. 
He is also a member of the MLSec Project (http://www.mlsecproject.org). You can follow him on Twitter as @DavidJBianco or subscribe to his blog, "Enterprise Detection & Response" (http://detect-respond.blogspot.com). Chris McCubbin is the Director of Data Science and a co-founder of Sqrrl Data, Inc. His primary task is prototyping new designs and algorithms to extend the capabilities of the Sqrrl Enterprise cybersecurity solution. Prior to cofounding Sqrrl, he spent 2 years developing big-data analytics for the Department of Defense at TexelTek, Inc and 10 years as Senior Professional Staff at the Johns Hopkins Applied Physics Laboratory where he applied machine learning algorithms to swarming unmanned vehicle ensembles. He holds a Masters degree in Computer Science and Bachelor’s degrees in Mathematics and Computer Science from the University of Maryland.
Views: 208 BSides Boston
Talks@12: Data Science & Medicine
 
54:49
Innovations in ways to compile, assess and act on the ever-increasing quantities of health data are changing the practice and policy of medicine. Statisticians Laura Hatfield and Sherri Rose will discuss recent methodological advances and the impact of big data on human health. Speakers: Laura Hatfield, PhD Associate Professor, Department of Health Care Policy, Harvard Medical School Sherri Rose, PhD Associate Professor, Department of Health Care Policy, Harvard Medical School Like Harvard Medical School on Facebook: https://goo.gl/4dwXyZ Follow on Twitter: https://goo.gl/GbrmQM Follow on Instagram: https://goo.gl/s1w4up Follow on LinkedIn: https://goo.gl/04vRgY Website: https://hms.harvard.edu/
Finale Doshi-Velez: "A Roadmap for the Rigorous Science of Interpretability" | Talks at Google
 
54:35
With a growing interest in interpretability, there is an increasing need to characterize what exactly we mean by it and how to sensibly compare the interpretability of different approaches. In this talk, I'll start by discussing some research in interpretable machine learning from our group, and then broaden it out to discuss what interpretability is and when it is needed. I'll argue that our current desire for "interpretability" is as vague as asking for "good predictions" -- a desire that, while entirely reasonable, must be formalized into concrete needs such as high average test performance (perhaps held-out likelihood is a good metric) or some kind of robust performance (perhaps sensitivity or specificity are more appropriate metrics). The objective of this talk is to start a conversation to do the same for interpretability: I will suggest a taxonomy for interpretable models and their evaluation, and also highlight important open questions about the science of interpretability in machine learning.
Views: 2247 Talks at Google
NW-NLP 2018: Semantic Matching Against a Corpus
 
01:00:21
The fifth Pacific Northwest Regional Natural Language Processing Workshop will be held on Friday April 27, 2018, in Redmond, WA. We accepted abstracts and papers on all aspects of natural language text and speech processing, computational linguistics, and human language technologies. As with the past four workshops, the goal of this one-day NW-NLP event is to provide a less formal setting in the Pacific Northwest to present research ideas, make new acquaintances, and learn about the breadth of exciting work currently being pursued in the North-West area. Morning Talks Title: Semantic Matching Against a Corpus: New Applications and Methods Speakers: Lucy Lin, Scott Miles and Noah Smith. Title: Synthetic and Natural Noise Both Break Neural Machine Translation Speakers: Yonatan Belinkov and Yonatan Bisk. Title: Syntactic Scaffolds for Semantic Structures Speakers: Swabha Swayamdipta, Sam Thomson, Kenton Lee, Luke Zettlemoyer, Chris Dyer and Noah A. Smith. See more at https://www.microsoft.com/en-us/research/video/nw-nlp-2018-semantic-matching-against-a-corpus-new-applications-and-methods-synthetic-and-natural-noise-both-break-neural-machine-translation-and-syntactic-scaffolds-for-semantic-structures/
Views: 366 Microsoft Research
NYC Open Data Meetup, R package Caret workshop part1
 
52:04
NYC Open Data Meetup, R package Caret workshop
Views: 796 Vivian Zhang
3   3   3 Exhaustive search attacks 20 min 003
 
01:28
Coursera Cryptography
Views: 92 Marcel van Vuure
Kent Hovind - Seminar 4 - Lies in the textbooks [MULTISUBS]
 
02:47:47
Creation Seminar 4: Lies in the Textbooks by Dr. Kent Hovind WITH SUBTITLES: Afrikaans, Bulgarian, Chinese_CS, Chinese_CT, Croatian, English, Estonian, Finnish, French, German, Hebrew, Indonesian, Latvian, Norwegian, Portuguese, Romanian, Russian, Spanish, Swedish Dr. Hovind shows how public school textbooks are permeated with fraudulent information in order to convince students that evolution is true. Topics included: the geologic column, the Grand Canyon, vestigial organs, the deception of Haeckel's embryonic research, DNA, and many more. Enjoy this point-by-point, entertaining demonstration of scientific evidence used to shed light on each of the lies still being pushed upon our culture. Learn active steps you can take to impact your public school system! No ratings enabled because truth is not based on majority opinion.
Views: 54135 Didymus Didymus