Into AI Safety
Content provided by Jacob Haimes. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Jacob Haimes or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://ru.player.fm/legal.
The Into AI Safety podcast aims to make it easier for everyone, regardless of background, to get meaningfully involved with the conversations surrounding the rules and regulations which should govern the research, development, deployment, and use of the technologies encompassed by the term "artificial intelligence" or "AI". For better formatted show notes, additional resources, and more, go to https://into-ai-safety.github.io. For even more content and community engagement, head over to my Patreon at https://www.patreon.com/IntoAISafety.
19 episodes
All episodes
INTERVIEW: Scaling Democracy w/ (Dr.) Igor Krawczuk (2:58:46)
The almost-Dr. Igor Krawczuk joins me for what is the equivalent of 4 of my previous episodes. We get into all the classics: eugenics, capitalism, philosophical toads... Need I say more? If you're interested in connecting with Igor, head on over to his website, or check out placeholder for thesis (it isn't published yet).

Because the full show notes have a whopping 115 additional links, I'll highlight some that I think are particularly worthwhile here:

The best article you'll ever read on Open Source AI
The best article you'll ever read on emergence in ML
Kate Crawford's Atlas of AI (Wikipedia)
On the Measure of Intelligence
Thomas Piketty's Capital in the Twenty-First Century (Wikipedia)
Yurii Nesterov's Introductory Lectures on Convex Optimization

Chapters:
(02:32) - Introducing Igor
(10:11) - Aside on EY, LW, EA, etc., a.k.a. lettersoup
(18:30) - Igor on AI alignment
(33:06) - "Open Source" in AI
(41:20) - The story of infinite riches and suffering
(59:11) - On AI threat models
(01:09:25) - Representation in AI
(01:15:00) - Hazard fishing
(01:18:52) - Intelligence and eugenics
(01:34:38) - Emergence
(01:48:19) - Considering externalities
(01:53:33) - The shape of an argument
(02:01:39) - More eugenics
(02:06:09) - I'm convinced, what now?
(02:18:03) - AIxBio (round ??)
(02:29:09) - On open release of models
(02:40:28) - Data and copyright
(02:44:09) - Scientific accessibility and bullshit
(02:53:04) - Igor's point of view
(02:57:20) - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance. All references, including those only mentioned in the extended version of this episode, are included.

Suspicious Machines Methodology, referred to as the "Rotterdam Lighthouse Report" in the episode
LIONS Lab at EPFL
The meme that Igor references
On the Hardness of Learning Under Symmetries
Course on the concept of equivariant deep learning
Aside on EY/EA/etc.:
  Sources on Eliezer Yudkowsky: Scholarly Community Encyclopedia, TIME100 AI, Yudkowsky's personal website, EY Wikipedia, A Very Literary Wiki
  TIME article: Pausing AI Developments Isn't Enough. We Need to Shut it All Down, documenting EY's ruminations on bombing datacenters; this comes up later in the episode but is included here because it is about EY.
  LessWrong
  LW Wikipedia
  MIRI
Coverage on Nick Bostrom (being a racist)
The Guardian article: 'Eugenics on steroids': the toxic and contested legacy of Oxford's Future of Humanity Institute
The Guardian article: Oxford shuts down institute run by Elon Musk-backed philosopher
Investigative piece on Émile Torres
On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜
NY Times article: We Teach A.I. Systems Everything, Including Our Biases
NY Times article: Google Researcher Says She Was Fired Over Paper Highlighting Bias in A.I.
Timnit Gebru's Wikipedia
The TESCREAL Bundle: Eugenics and the Promise of Utopia through Artificial General Intelligence
Sources on the environmental impact of LLMs:
  The Environmental Impact of LLMs
  The Cost of Inference: Running the Models
  Energy and Policy Considerations for Deep Learning in NLP
  The Carbon Impact of AI vs Search Engines
Filling Gaps in Trustworthy Development of AI (Igor is an author on this one)
A Computational Turn in Policy Process Studies: Coevolving Network Dynamics of Policy Change
The Smoothed Possibility of Social Choice, an intro to social choice theory and how it overlaps with ML
Relating to Dan Hendrycks:
  Natural Selection Favors AIs over Humans
  "One easy-to-digest source to highlight what he gets wrong [is] Social and Biopolitical Dimensions of Evolutionary Thinking" -Igor
  Introduction to AI Safety, Ethics, and Society, recently published textbook
  "Source to the section [of this paper] that makes Dan one of my favs from that crowd." -Igor
Twitter post referenced in the episode
…

INTERVIEW: StakeOut.AI w/ Dr. Peter Park (3) (1:42:00)
As always, the best things come in 3s: dimensions, musketeers, pyramids, and... 3 installments of my interview with Dr. Peter Park, an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. As you may have ascertained from the previous two segments of the interview, Dr. Park cofounded StakeOut.AI along with Harry Luk and one other cofounder, whose name has been removed due to requirements of her current position. The non-profit had a simple but important mission: make the adoption of AI technology go well for humanity. Unfortunately, StakeOut.AI had to dissolve in late February of 2024 because no grantmaker would fund them. Although it certainly is disappointing that the organization is no longer functioning, all three cofounders continue to contribute positively towards improving our world in their current roles. If you would like to investigate further into Dr. Park's work, view his website, Google Scholar, or follow him on Twitter.

00:00:54 ❙ Intro
00:02:41 ❙ Rapid development
00:08:25 ❙ Provable safety, safety factors, & CSAM
00:18:50 ❙ Litigation
00:23:06 ❙ Open/Closed Source
00:38:52 ❙ AIxBio
00:47:50 ❙ Scientific rigor in AI
00:56:22 ❙ AI deception
01:02:45 ❙ No takesies-backsies
01:08:22 ❙ StakeOut.AI's start
01:12:53 ❙ Sustainability & Agency
01:18:21 ❙ "I'm sold, next steps?" -you
01:23:53 ❙ Lessons from the amazing Spiderman
01:33:15 ❙ "I'm ready to switch careers, next steps?" -you
01:40:00 ❙ The most important question
01:41:11 ❙ Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

StakeOut.AI
Pause AI
AI Governance Scorecard (go to Pg. 3)
CIVITAI
Article on CIVITAI and CSAM
Senate Hearing: Protecting Children Online
PBS Newshour Coverage
The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work
Open Source/Weights/Release/Interpretation:
  Open Source Initiative
  History of the OSI
  Meta's LLaMa 2 license is not Open Source
  Is Llama 2 open source? No – and perhaps we need a new definition of open…
  Apache License, Version 2.0
  3Blue1Brown: Neural Networks
  Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators
  The online table
  Signal
  Bloomz model on HuggingFace
  Mistral website
NASA Tragedies:
  Challenger disaster on Wikipedia
  Columbia disaster on Wikipedia
AIxBio Risk:
  Dual use of artificial-intelligence-powered drug discovery
  Can large language models democratize access to dual-use biotechnology?
  Open-Sourcing Highly Capable Foundation Models (sadly, I can't rename the article...)
  Propaganda or Science: Open Source AI and Bioterrorism Risk
  Exaggerating the risks (Part 15: Biorisk from LLMs)
  Will releasing the weights of future large language models grant widespread access to pandemic agents?
  On the Societal Impact of Open Foundation Models
  Policy brief
Apart Research
Cicero:
  Science article: Human-level play in the game of Diplomacy by combining language models with strategic reasoning
  Cicero webpage
AI Deception: A Survey of Examples, Risks, and Potential Solutions
Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation
AI Safety Camp
Into AI Safety Patreon
…

INTERVIEW: StakeOut.AI w/ Dr. Peter Park (2) (1:06:23)
Join me for round 2 with Dr. Peter Park, an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. Dr. Park was a cofounder of StakeOut.AI, a non-profit focused on making AI go well for humans, along with Harry Luk and one other individual, whose name has been removed due to requirements of her current position.

In addition to the normal links, I wanted to include the links to the petitions that Dr. Park mentions during the podcast. Note that the nonprofit which began these petitions, StakeOut.AI, has been dissolved.

Right AI Laws, to Right Our Future: Support Artificial Intelligence Safety Regulations Now
Is Deepfake Illegal? Not Yet! Ban Deepfakes to Protect Your Family & Demand Deepfake Laws
Ban Superintelligence: Stop AI-Driven Human Extinction Risk

00:00:54 - Intro
00:02:34 - Battleground 1: Copyright
00:06:28 - Battleground 2: Moral Critique of AI Collaborationists
00:08:15 - Rich Sutton
00:20:41 - OpenAI Drama
00:34:28 - Battleground 3: Contract Negotiations for AI Ban Clauses
00:37:57 - Tesla, Autopilot, and FSD
00:40:02 - Recycling
00:47:40 - Battleground 4: New Laws and Policies
00:50:00 - Battleground 5: Whistleblower Protections
00:53:07 - Whistleblowing on Microsoft
00:54:43 - Andrej Karpathy & Exercises in Empathy
01:05:57 - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

StakeOut.AI
The Times Sues OpenAI and Microsoft Over A.I. Use of Copyrighted Work
Susman Godfrey LLP
Rich Sutton:
  Reinforcement Learning: An Introduction (textbook)
  AI Succession (presentation by Rich Sutton)
  The Alberta Plan for AI Research
Moore's Law:
  The Future of Integrated Electronics (original paper)
  Computer History Museum's entry on Moore's Law
Stochastic gradient descent (SGD) on Wikipedia
OpenAI Drama:
  Max Read's Substack post
  Zvi Mowshowitz's Substack series, in order of posting:
    OpenAI: Facts from a Weekend
    OpenAI: The Battle of the Board
    OpenAI: Altman Returns
    OpenAI: Leaks Confirm the Story ← best singular post in the series
    OpenAI: The Board Expands
  Official OpenAI announcement
WGA on Wikipedia
SAG-AFTRA on Wikipedia
Tesla's False Advertising:
  Tesla's response to the DMV's false-advertising allegations: What took so long?
  Tesla Tells California DMV that FSD Is Not Capable of Autonomous Driving
  What to Call Full Self-Driving When It Isn't Full Self-Driving?
  Tesla fired an employee after he posted driverless tech reviews on YouTube
  Tesla's page on Autopilot and Full Self-Driving
Recycling:
  Boulder County Recycling Center Stockpiles Accurately Sorted Recyclable Materials
  Out of sight, out of mind
  Boulder Eco-Cycle Recycling Guidelines
Divide-and-Conquer Dynamics in AI-Driven Disempowerment
Microsoft Whistleblower:
  Whistleblowers call out AI's flaws
  Shane's LinkedIn post
  Letters sent by Jones
Karpathy announces departure from OpenAI
…
UPDATE: Contrary to what I say in this episode, I won't be removing any episodes that are already published from the podcast RSS feed. After getting some advice and reflecting more on my own personal goals, I have decided to shift the direction of the podcast towards accessible content regarding "AI" instead of the show's original focus. I will still be releasing what I am calling research ride-along content to my Patreon, but the show's feed will consist only of content that I aim to make as accessible as possible.

00:35 - TL;DL
01:12 - Advice from Pete
03:10 - My personal goal
05:39 - Reflection on refining my goal
09:08 - Looking forward (logistics…
Dr. Peter Park is an AI Existential Safety Postdoctoral Fellow working with Dr. Max Tegmark at MIT. In conjunction with Harry Luk and one other cofounder, he founded StakeOut.AI, a non-profit focused on making AI go well for humans.

00:54 - Intro
03:15 - Dr. Park, x-risk, and AGI
08:55 - StakeOut.AI
12:05 - Governance scorecard
19:34 - Hollywood webinar
22:02 - Regulations.gov comments
23:48 - Open letters
26:15 - EU AI Act
35:07 - Effective accelerationism
40:50 - Divide and conquer dynamics
45:40 - AI "art"
53:09 - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

StakeOut.AI
AI Governance Scorecard (go to Pg. 3)
Pause AI
Regulations.gov:
  USCO StakeOut.AI Comment
  OMB StakeOut.AI Comment
AI Treaty open letter
TAISC
Alpaca: A Strong, Replicable Instruction-Following Model
References on EU AI Act and Cedric O:
  Tweet from Cedric O
  EU policymakers enter the last mile for Artificial Intelligence rulebook
  AI Act: EU Parliament's legal office gives damning opinion on high-risk classification 'filters'
  EU's AI Act negotiations hit the brakes over foundation models
  The EU AI Act needs Foundation Model Regulation
  BigTech's Efforts to Derail the AI Act
Open Sourcing the AI Revolution: Framing the debate on open source, artificial intelligence and regulation
Divide-and-Conquer Dynamics in AI-Driven Disempowerment
…
Take a trip with me through the paper Large Language Models, A Survey, published on February 9th of 2024. All figures and tables mentioned throughout the episode can be found on the Into AI Safety podcast website.

00:36 - Intro and authors
01:50 - My takes and paper structure
04:40 - Getting to LLMs
07:27 - Defining LLMs & emergence
12:12 - Overview of PLMs
15:00 - How LLMs are built
18:52 - Limitations of LLMs
23:06 - Uses of LLMs
25:16 - Evaluations and Benchmarks
28:11 - Challenges and future directions
29:21 - Recap & outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Large Language Models, A Survey
Meysam's LinkedIn Post
Claude E. Shannon:
  A symbolic analysis of relay and switching circuits (Master's Thesis)
  Communication theory of secrecy systems
  A mathematical theory of communication
  Prediction and entropy of printed English
Future ML Systems Will Be Qualitatively Different
More Is Different
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Are Emergent Abilities of Large Language Models a Mirage?
Are Emergent Abilities of Large Language Models just In-Context Learning?
Attention is all you need
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
KTO: Model Alignment as Prospect Theoretic Optimization
Optimization by Simulated Annealing
Memory and new controls for ChatGPT
Hallucinations and related concepts—their conceptual background
…
Esben reviews an application that I would soon submit for Open Philanthropy's Career Transition Funding opportunity. Although I didn't end up receiving the funding, I do think that this episode can be a valuable resource both for others and for myself when applying for funding in the future.

Head over to Apart Research's website to check out their work, or the Alignment Jam website for information on upcoming hackathons. A doc-capsule of the application at the time of this recording can be found at this link.

01:38 - Interview starts
05:41 - Proposal
11:00 - Personal statement
14:00 - Budget
21:12 - CV
22:45 - Application questions
34:06 - Funding questions
44:25 - Outro

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

AI governance talent profiles we'd like to see
The AI Governance Research Sprint
Reasoning Transparency
Places to look for funding:
  Open Philanthropy's Career development and transition funding
  Long-Term Future Fund
  Manifund
…
Before I begin with the paper-distillation-based minisodes, I figured we would go over best practices for reading research papers. I go through the anatomy of typical papers and give some generally applicable advice.

00:56 - Anatomy of a paper
02:38 - Most common advice
05:24 - Reading sparsity and path
07:30 - Notes and motivation

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Ten simple rules for reading a scientific paper
Best sources I found:
  Let's get critical: Reading academic articles
  #GradHacks: A guide to reading research papers
  How to read a scientific paper (presentation)
Some more sources:
  How to read a scientific article
  How to read a research paper
  Reading a scientific article
…
Join our hackathon group for the second episode in the Evals November 2023 Hackathon subseries. In this episode, we solidify our goals for the hackathon after some preliminary experimentation and ideation. Check out Stellaric's website, or follow them on Twitter.

01:53 - Meeting starts
05:05 - Pitch: extension of locked models
23:23 - Pitch: retroactive holdout datasets
34:04 - Preliminary results
37:44 - Next steps
42:55 - Recap

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Evalugator library
Password Locked Model blogpost
TruthfulQA: Measuring How Models Mimic Human Falsehoods
BLEU: a Method for Automatic Evaluation of Machine Translation
BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions
Detecting Pretraining Data from Large Language Models
…
I provide my thoughts and recommendations regarding personal professional portfolios.

00:35 - Intro to portfolios
01:42 - Modern portfolios
02:27 - What to include
04:38 - Importance of visual
05:50 - The "About" page
06:25 - Tools
08:12 - Future of "Minisodes"

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

From Portafoglio to Eportfolio: The Evolution of Portfolio in Higher Education
GIMP
AlternativeTo
Jekyll
GitHub Pages
Minimal Mistakes
My portfolio
…
Darryl and I discuss his background, how he became interested in machine learning, and a project we are currently working on, which investigates penalizing polysemanticity during the training of neural networks. Check out a diagram of the decoder task used for our research, and see the illustrative sketch after the links below.

01:46 - Interview begins
02:14 - Supernovae classification
08:58 - Penalizing polysemanticity
20:58 - Our "toy model"
30:06 - Task description
32:47 - Addressing hurdles
39:20 - Lessons learned

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Zooniverse
BlueDot Impact
AI Safety Support
Zoom In: An Introduction to Circuits
MNIST dataset on PapersWithCode
Clusterability in Neural Networks
CIFAR-10 dataset
Effective Altruism Global
CLIP (blog post)
Long Term Future Fund
Engineering Monosemanticity in Toy Models
…
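The episode only describes the project at a high level, so to make "penalizing polysemanticity" concrete, here is a minimal sketch of what such a training objective could look like in a toy reconstruction task, in the spirit of the toy-model work linked above. The penalty form, architecture, dimensions, and hyperparameters are my own illustrative assumptions, not the actual setup used in the research discussed.

```python
# Hypothetical sketch: a toy feature-reconstruction ("decoder") task with an
# added penalty that discourages hidden units from reading multiple features.
# All choices below (penalty form, sizes, lambda) are illustrative assumptions.
import torch
import torch.nn as nn

N_FEATURES, N_HIDDEN = 16, 8  # assumed toy dimensions

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.encoder = nn.Linear(N_FEATURES, N_HIDDEN, bias=False)
        self.decoder = nn.Linear(N_HIDDEN, N_FEATURES, bias=True)

    def forward(self, x):
        return self.decoder(torch.relu(self.encoder(x)))

def polysemanticity_penalty(weight: torch.Tensor) -> torch.Tensor:
    """Penalize each hidden unit for spreading weight mass over many features:
    sum of |w| minus the single largest |w| per unit is zero exactly when the
    unit reads from only one feature."""
    abs_w = weight.abs()  # shape: (n_hidden, n_features)
    return (abs_w.sum(dim=1) - abs_w.max(dim=1).values).mean()

model = ToyModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
lam = 0.1  # penalty strength (assumed)

for step in range(1000):
    # Sparse synthetic features, as is typical in superposition toy models.
    x = (torch.rand(256, N_FEATURES) < 0.05).float() * torch.rand(256, N_FEATURES)
    loss = nn.functional.mse_loss(model(x), x) \
        + lam * polysemanticity_penalty(model.encoder.weight)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

With the penalty weight set to zero the model is free to superpose many features per hidden unit; increasing lam trades reconstruction error for more monosemantic units, which is the general tension the project appears to probe.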
A summary of, and reflections on, the path I have taken to get this podcast started, including some resource recommendations for others who want to do something similar.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

LessWrong
Spotify for Podcasters
Into AI Safety podcast website
Effective Altruism Global
Open Broadcaster Software (OBS)
Craig
Riverside
…

HACKATHON: Evals November 2023 (1) (1:08:39)
This episode kicks off our first subseries, which will consist of recordings taken during my team's meetings for the Alignment Jam Evals Hackathon in November of 2023. Our team won first place, so you'll be listening to the process which, at the end of the day, turned out to be pretty good. Check out Apart Research, the group that runs the Alignment Jam hackathons.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Generalization Analogies: A Testbed for Generalizing AI Oversight to Hard-To-Measure Domains
New paper shows truthfulness & instruction-following don't generalize by default
Generalization Analogies Website
Discovering Language Model Behaviors with Model-Written Evaluations
Model-Written Evals Website
OpenAI Evals GitHub
METR (previously ARC Evals)
Goodharting on Wikipedia
From Instructions to Intrinsic Human Values, a Survey of Alignment Goals for Big Models
Fine Tuning Aligned Language Models Compromises Safety Even When Users Do Not Intend
Shadow Alignment: The Ease of Subverting Safely Aligned Language Models
Will Releasing the Weights of Future Large Language Models Grant Widespread Access to Pandemic Agents?
Building Less Flawed Metrics, Understanding and Creating Better Measurement and Incentive Systems
EleutherAI's Model Evaluation Harness
Evalugator Library
…
In this minisode I give some tips for staying up-to-date in the ever-changing landscape of AI. I would like to point out that I am constantly iterating on these strategies, tools, and sources, so it is likely that I will make an update episode in the future. A small automation sketch follows the links below.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

Tools:
  Feedly
  arXiv Sanity Lite
  Zotero
  AlternativeTo

My "Distilled AI" Folder:
  AI Explained YouTube channel
  AI Safety newsletter
  Data Machina newsletter
  Import AI
  Midwit Alignment

Honourable Mentions:
  AI Alignment Forum
  LessWrong
  Bounded Regret (Jacob Steinhardt's blog)
  Cold Takes (Holden Karnofsky's blog)
  Chris Olah's blog
  Tim Dettmers' blog
  Epoch blog
  Apollo Research blog
…
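For listeners who want to script part of this workflow, here is a minimal sketch of polling arXiv for recent papers. It assumes the public arXiv Atom API at export.arxiv.org and the third-party feedparser package; the categories queried and the result count are arbitrary examples, not recommendations made in the episode.

```python
# A minimal sketch of automating the "check arXiv" step of a reading workflow.
# Assumes: the public arXiv Atom API (export.arxiv.org) and `pip install feedparser`.
import feedparser

# Query: newest submissions in two example categories (cs.LG, cs.CL).
ARXIV_API = (
    "http://export.arxiv.org/api/query"
    "?search_query=cat:cs.LG+OR+cat:cs.CL"
    "&sortBy=submittedDate&sortOrder=descending&max_results=10"
)

feed = feedparser.parse(ARXIV_API)
for entry in feed.entries:
    # Each Atom entry carries a title, an abstract link, and a submission date.
    print(f"{entry.published}  {entry.title.strip()}")
    print(f"    {entry.link}")
```

A script like this could feed into whichever reader (Feedly, Zotero, a plain text file) you already use; the point is only that the polling side is a few lines.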

INTERVIEW: Applications w/ Alice Rigg (1:10:41)
Alice Rigg, a mechanistic interpretability researcher from Ottawa, Canada, joins me to discuss their path and the application process for research/mentorship programs. Join the Mech Interp Discord server and attend reading groups at 11:00am on Wednesdays (Mountain Time)! Check out Alice's website.

Links to all articles/papers which are mentioned throughout the episode can be found below, in order of their appearance.

EleutherAI
Join the public EleutherAI discord server
Distill
Effective Altruism (EA)
MATS
Retrospective Summer 2023 post
Ambitious Mechanistic Interpretability AISC research plan by Alice Rigg
SPAR
Stability AI
During their most recent fundraising round, Stability AI had a valuation of $4B (Bloomberg)
Mech Interp Discord Server
…