
Content provided by Rob Collie and P3 Adaptive. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Rob Collie and P3 Adaptive or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process described here https://ru.player.fm/legal.

The Origins of Power Query, w/ Sid Jayadevan and Miguel Llopis

1:10:30
 

It was an honor to sit down with two of the creators of Power Query, Sid Jayadevan and Miguel Llopis. You get to hear the history of Power Query from their perspective! These guys are so busy it took quite a bit of coordination to get them together to record this episode. We hope you enjoy it as much as we did!

References In This Episode:

  • Reese's Peanut Butter Cups Old School Commercial
  • Office Space Jump To Conclusions
  • Hitler Hits A Breaking Point With Tableau
  • PQ Diagram View (Preview)

Episode Timeline:

  • 4:45 - The Origin of Power Query, and the importance of M
  • 19:00 - How the various Microsoft departments come together for a big project
  • 30:15 - The many uses of PQ and the Power Platform, error handling in PQ, and some solutions to some common problems
  • 53:30 - What's next for PQ and Dataflows

Episode Transcript:

Rob Collie (00:00:00): Hello Friends, this week's guests are Miguel and Sid, the data integrations team at Microsoft. They were both around and involved when Power Query essentially emerged from the primordial ooze at Microsoft. And that's what this week's episode is mostly about, which is, what is the Power Query origin story? I've long been fascinated by that question as an outsider and observer, since this all happened after I left Microsoft. It's kind of hard to imagine a time when we didn't have Power Query. But there was actually three or four years in there, when we were building DAX data models for clients without benefit of Power Query. And this is where I use my old man voice. Kids these days, they have no idea how easy they have it, back in our day, we didn't have fancy Power Query, you just had to cobble the data together by hand. And so I'm not just fascinated by the origin story here, I'm actually deeply impressed and appreciative, especially as a former software engineer, who knows how challenging it is, just how gorgeous Power Query really is.

Rob Collie (00:01:06): And because of that, along those lines, during this conversation, I kept trying to get these two gentlemen to take the victory lap. They didn't take that bait, too humble, too cognizant of the work yet to be done, which of course, is really how we would want it, isn't it? Because seriously, it's good to know that there are people like this at Microsoft, who are stewards, really, of our futures. They're the ones building, not just the tools that we already use, but the tools that we're going to use in the future. There are a few places in here where we geek out a little bit as computer scientists, but mostly, the conversation stayed very firmly rooted in the human beings, the human element again, which is what it's all about. I'm sincerely honored that they took two hours out of their busy schedules to spend speaking with me and speaking to you on this podcast. I hope you enjoy it. I hope you learn something from it. So let's get into it.

Announcer (00:02:03): Ladies and gentlemen, may I have your attention, please.

Announcer (00:02:07): This is the Raw Data by P3 Podcast, with your host, Rob Collie. Find out what the experts at P3 can do for your business, go to powerpivotpro.com. Raw Data by P3 is data with the human element.

Rob Collie (00:02:23): Welcome to the show, Sid and Miguel, how are you today fine gentlemen?

Sid Jayadevan (00:02:27): Very well, thank you. Many thanks for having us.

Miguel Llopis (00:02:30): Yeah, doing great, Rob. Thank you so much for having us, it's a pleasure.

Rob Collie (00:02:33): Seriously, to get a hold of the two of you, and coordinate calendars and make this happen, it's an honor for us. So this was something that almost from the moment we launched the podcast, asking members of our team at P3, what would be interesting topics. One of them that just sort of keeps insistently coming up is where did Power Query come from? And so I was on this mission to hunt down some of the people who could speak to that. And there's a lot of things we can talk about. What are your roles today at Microsoft?

Miguel Llopis (00:03:06): Sid, do you want to go first?

Sid Jayadevan (00:03:07): Yep. So I am an engineering manager at Microsoft. I manage the team that works on data integration, with Power Query being one of the key elements, but we have a variety of other things we do with connectors, gateways, dataflows, all of which are very symbiotic. Power Query depends on all of those things, and all of those things depend on Power Query. So there's a larger space that we operate within, but Power Query has been around the longest of all of those things.

Rob Collie (00:03:39): So I didn't know this going in. But I have some complaints about this QuickBooks connector. I'm sure you know what those complaints are too. They will not be ones that you've heard for the first time from me. Making a note of that.

Sid Jayadevan (00:03:55): No points for guessing.

Rob Collie (00:03:56): No points. Okay. Miguel, what are your responsibilities today?

Miguel Llopis (00:03:59): I'm the program manager lead for Power Query, and connectors and dataflows. So a bunch of technologies and experiences that Sid was talking about. Been on the team for quite a while, maybe about the same as Sid, maybe a bit less. He's older than me, so that will show through this interview.

Rob Collie (00:04:17): Oh, yeah. Well, we're only recording the audio here, you can't really tell.

Miguel Llopis (00:04:22): I meant from his wisdom, not from the look.

Rob Collie (00:04:24): Oh, I see.

Sid Jayadevan (00:04:26): That's Miguel's way of saying I'm Palaeolithic.

Miguel Llopis (00:04:30): There you go.

Rob Collie (00:04:30): Oh, yeah. I see. You're also a free range gluten free?

Miguel Llopis (00:04:35): Of course.

Rob Collie (00:04:39): So you both go back a ways on Power Query specifically. So from my perspective, from outside the company, I left Redmond in 2009. And then I sort of formally left Microsoft in February of 2010. So I've been formally gone for 11 years, and sort of informally gone for a little longer than that. It's really kind of hard for me to even get back into the before mindset, when we had DAX, and data modeling, and we had the VertiPaq Engine, they're all still so incredibly central to Power BI today. And we had them in the Power Pivot form, we didn't even have SSAS Tabular yet.

Rob Collie (00:05:16): But we had nothing in this giant void that Power Query came along, to me, basically, out of the blue. I had no advance notice of this. My sources in Redmond, my spies, they hadn't hooked me up with the information that something amazing was going to come along and complement the tools. There are so many times, guys, seriously so many times where I or we would be working with a client and we would know what the ideal data model would look like. But they couldn't do it because the data that they were getting wasn't in the right format to build the right data model. Even something as simple as you need a lookup table or dimension table, and no one's giving it to you.

Rob Collie (00:05:58): So it was always like, "Oh, now you need to go find a DBA," that you're either lucky or you're not, you either had one or you didn't. And if you were unlucky, you were just out of luck, there was no recourse. You were just going to have to make these changes to these tables manually. There was no automatic refresh anymore. It was really a tremendous limitation. And then suddenly, and at the time when it first arrived, what was it? The first name? Was it Data Explorer?
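Editor's note: for readers unfamiliar with the lookup-table problem Rob describes, the following is a minimal sketch in Python (not M, and not how Power Query implements it). The column names and sample rows are invented for illustration. It shows the kind of repeatable step that derives a dimension table from a flat extract, so the work does not have to be redone by hand on every refresh.

```python
from typing import Dict, List

# Hypothetical flat extract: every order row repeats the product details.
flat_rows: List[Dict[str, object]] = [
    {"order_id": 1, "product": "Widget", "category": "Hardware", "amount": 10.0},
    {"order_id": 2, "product": "Gadget", "category": "Hardware", "amount": 25.0},
    {"order_id": 3, "product": "Widget", "category": "Hardware", "amount": 12.5},
]


def build_lookup(rows: List[Dict[str, object]], key: str,
                 attributes: List[str]) -> List[Dict[str, object]]:
    """Return one row per distinct key value, in first-seen order."""
    seen: Dict[object, Dict[str, object]] = {}
    for row in rows:
        value = row[key]
        if value not in seen:
            # Keep only the key and the descriptive attributes.
            seen[value] = {key: value, **{a: row[a] for a in attributes}}
    return list(seen.values())


# Re-running this on a refreshed extract rebuilds the lookup table automatically.
product_dim = build_lookup(flat_rows, key="product", attributes=["category"])
```

Because the step is a function of its input rather than a manual edit, it can be replayed whenever the source data changes, which is the repeatability the conversation keeps returning to.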

Miguel Llopis (00:06:27): Yes, Data Explorer.

Rob Collie (00:06:30): Even that initial name sort of conveyed a different mission than what it sort of morphed into. I don't even really remember. It was also a way to connect to lots of different things, wasn't it? As Power Query is today. Were you around when Data Explorer, when that name was chosen? What was behind that?

Miguel Llopis (00:06:46): Yeah, that would be an interesting story. So actually, just going one step before that, and I think we're getting towards 2012 at this point, I think. The very first incarnation of what is today Power Query in market was something called SQL Azure lab for data exploration. It didn't even have an actual product name, we're going to call it that way. SQL Azure Labs was a set of initiatives across the SQL and national teams back in the day to actually spike different sets of technologies that would help in different segments, like data exploration would be one, data visualization. There were a bunch of things that came out that way.

Miguel Llopis (00:07:22): It was actually a full cloud based Power Query like experience to actually connect to data, transform data, and then output data in different ways. There were ways to actually get your data out as an OData endpoint that you could use maybe to create an app, maybe to consume from Power Pivot, to your point, Rob. The feedback back then, and again 2012 was, hey, Pivot is working in Excel, we want these experiences in Excel. So we repivoted all of that, no pun intended with repivoting, towards a client-based experience that was actually an Excel add-in that we initially released for Excel 2010 and 2013.

Miguel Llopis (00:07:59): There was quite a bit of naming related discussions, but I think the data exploration aspect, everyone got used to it. So we ended up coming up with Microsoft codename Data Explorer for Excel. That was the very first name for the Excel add-in, which later than that, a few months later, as we went into GA, actually got renamed to Power Query. And really the alignment there was with the power family, the power tools, Power Pivot, Power View, Power Maps back in the day as well. And then Power Query as a way to actually bring data in. Maybe Sid remembers some of these discussions more than me. I know there was also an alternative option, which was Power Import, but really, we went for the term query because it really reinforced the notion of repeatability and refreshability of those queries, no pun intended.

Sid Jayadevan (00:08:45): I think we tested many, many terms, and import was just one of them. But we felt like the essence of the product was that ability to query ad hoc and at will, and so we really wanted to focus on that. And so that's how we ended up with the query piece of it.

Rob Collie (00:09:03): I think Power Query was a great name. Honestly, it didn't really land for me what y'all were giving us when it was still called Data Explorer. That name was actually a very large cognitive obstacle for me, explorer, it sounds like an analysis tool. Now, knowing where the roots were, you had a name, there's a more appropriate name for what you were doing, probably when it was still that Azure Lab thing. But it's so funny, this happens all the time, when your mission pivots, certain parts of it still kind of leak through, like the old name, by default.

Rob Collie (00:09:36): Just like the old story, whether it's true or not, about the railroad tracks being the width that they are because the Roman chariot was that wide. It's just like, what we did yesterday is because we did it that way the day before and whatever. Certain things just have a momentum that carries forward. When the name changed to Power Query, suddenly, I was like, "Oh, okay, this is awesome." But when it was Data Explorer, it was just a really cool novelty. I didn't have a sense of its purpose, or I didn't feel like it was a serious tool yet. It's really kind of interesting the power of naming, isn't it?

Sid Jayadevan (00:10:08): Neither did we, to a large extent, we were trying to find that identity. And I think as we got deeper into it, the query first aspect became a lot clearer.

Rob Collie (00:10:20): I also think the Power Query versus Power Import, I think the right decision was made there. The repeatability, you know this, but I'm going to say it anyway, if I only get five minutes with an Excel person who's never been exposed to the Power Platform, I have five minutes and I have to drop their jaw, I'm going to show them Power Query. I'm not going to show them the data model, I can show them DAX. Now, I think that ultimately, you absolutely need to be using both. But Power Query is such an amazing life changer for the Excel crowd and they can immediately appreciate what it's going to do for them.

Rob Collie (00:10:57): It's harder for them to appreciate what the data model is going to do for them. Which is why, if I've got five minutes, I don't go subtle. Isn't it amazing? You're talking about figuring out your own mission over time, that was something amazing. It sounds like from what you're saying that the M language, and the engine that goes with it, and all the stuff that's really difficult to build and to design, a lot of that was already kind of done before the Excel crowd became a focus. Is that true?

Sid Jayadevan (00:11:28): Yeah, that's true. Power Query is essentially a visual interface on top of the M language. The M language is absolutely the essence of the product, it's the foundation. And that foundation was built before the product, as is often necessary, you need the foundation in place. And there is a long history around M that predates 2012, by let's just say, several years. And we won't delve into all the details, but much like we had to get clear about what Power Query was for and who we were targeting, there was a similar process with M. M was, from a technology point of view, this very simple, yet powerful thing, in that it was functional, it composed, in our opinion, reasonably well. And so you could do lots of different things with it. But we wanted to put it in the hands of lots and lots of people who didn't necessarily have a programming background, or even a query background.

Sid Jayadevan (00:12:29): And so that was the goal that we set for ourselves to bring all of those people on board, to make things possible for them that perhaps were a little harder in the past. And so M was really the foundation, and it was well in place circa 2012. We made some changes to make it more friendly to the visual experience, to make it just a little more designer tool friendly, if you will. But the core of the language was already in place. And on the language front, we had tried lots of different things. And many, many people at Microsoft were involved in that effort at many stages.
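Editor's note: Sid describes M as functional and composable, with the visual interface layered on top. The sketch below is an analogy in Python, not M code and not the Power Query engine. All function names (`filter_rows`, `add_column`, `compose`) and the sample data are hypothetical. It illustrates the general idea that each recorded step can be a pure function from table to table, and a query is simply the composition of those steps.

```python
from functools import reduce
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = List[Row]
Step = Callable[[Table], Table]


def filter_rows(predicate: Callable[[Row], bool]) -> Step:
    """Build a step that keeps only the rows matching the predicate."""
    return lambda table: [row for row in table if predicate(row)]


def add_column(name: str, fn: Callable[[Row], object]) -> Step:
    """Build a step that adds a computed column without mutating its input."""
    return lambda table: [{**row, name: fn(row)} for row in table]


def compose(*steps: Step) -> Step:
    """Chain steps left to right, feeding each result into the next step."""
    return lambda table: reduce(lambda acc, step: step(acc), steps, table)


# Invented sample data standing in for a source table.
source: Table = [
    {"region": "East", "units": 3, "price": 2.0},
    {"region": "West", "units": 0, "price": 5.0},
]

# A "query" is just a reusable composition of steps.
query = compose(
    filter_rows(lambda r: r["units"] > 0),
    add_column("revenue", lambda r: r["units"] * r["price"]),
)

result = query(source)
```

Because every step returns a new table instead of editing the old one, steps can be reordered, inserted, or removed independently. That property is plausibly part of why a graphical editor and hand-written code can coexist over the same script, though the transcript does not spell out the mechanism.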

Rob Collie (00:13:08): Something you said there really struck me and I wouldn't have thought about it this way until I heard the history. You made some changes, just some almost cosmetic changes to the language to make it more friendly to the visual composer aspect of Power Query. As soon as you said that, I'm like, "Oh, my god, yeah, this tool is super, super, super friendly to being edited and written from a graphical tool." A lot of times, you can go back into the M code and hand edit it and the visual editor is still completely okay. It totally understands what you did. That sort of round trip of hand editing and visual composer, exposing both to the user and still having a language and a tool that survives that duality, that's a challenge. That's a really big challenge. I've tried it multiple times at Microsoft. And I think I'm 0 for a lifetime on having designed a system that worked like that. So I can certainly appreciate it from a software engineering perspective, even just that one little detail of a language that was already pre-built. That's kind of amazing.

Sid Jayadevan (00:14:08): We're still working on it, it's very much a work in progress. And there are aspects that are a little more, what should I say, language oriented, that remain in M, that remain extremely powerful, that the visual interface doesn't leverage quite as much.

Rob Collie (00:14:25): I've been talking to Miguel about this. And it's not just a deep power, but I think the Power Query transform, there should be many more tabs, because the language is so flexible. And I know that not everything can be turned into visual, some things are just absolutely going to forever remain 100% in the realm of I have to hand edit the M. I'm really nothing but a fan here.

Rob Collie (00:14:45): I told Miguel on a previous call that back in the early 2000s, when I was on the Excel team, I caught the XML bug. The first feature set that I was a lead program manager for was the XML import and export capabilities. Not the XML file format, but data payloads, invoices or whatever, be able to move those in and out of Excel. And I spent two, four years of my life chasing a dream that I call Data Merge. There are huge, elaborate graphical mockups of all of this, it was really ambitious. It's exactly the kind of project you'd expect a young software engineer to get all amped up about and geeked out about.

Rob Collie (00:15:27): The thing I thought we needed to do was build some sort of repeatable data transformation logic into Excel. And I tried so hard to get budget, to get approval, to get greenlit to build a team to do this. And I got shot down four times a year for multiple years, every quarter, I make another run at it, like, "Come on, let's do this." Now that I've seen what you built, I am so glad that I never got approval to dive into that. Because once I've seen what it looks like, a complete solution to it, I realized, oh my gosh, we were so overmatched. We would have never, ever succeeded, never come close. The fact that you had to go build a language first, makes sense to me now in hindsight, but it's really chilling. Oh my gosh, imagine they let me do my passion project. Thank you, Richard McAniff for never believing in me.

Miguel Llopis (00:16:24): Well, I'm sure you-

Sid Jayadevan (00:16:26): Yeah, not so sure about that, Rob. You might have done a lot better.

Rob Collie (00:16:30): I doubt it.

Sid Jayadevan (00:16:31): [crosstalk 00:16:31] Well, we'll never know.

Rob Collie (00:16:33): Well, I do, and I wouldn't have it.

Miguel Llopis (00:16:36): Well, Rob, I think you're being too hard on yourself. We don't have answers to everything. I assume you would have just like us, just fail fast, learn fast, iterate, learn from customers, learn from you, Sid, and just refine and get better over time. Here we are 10 years later, or eight years later, and we still have a lot of things to improve on.

Rob Collie (00:16:56): It's not even really just about me, it's also that Office doesn't have the right culture, to do something like what you've done. We would have gone and tried to solve a handful of simple cases, that's what would've happened under schedule pressure. And we would have gotten committed to a system that wasn't elegant at its core. And then we never would have been able to really scale it to address... Because you know how it is, if you address 99% of problems that people have, it doesn't matter, that 1% is still going to plague enough of their workflows, that is the difference between they can adopt your tool or not. You've really got to be complete. We would have never been complete enough. And I can say that with confidence, knowing myself at the time, and also knowing the culture that was around me. We would have never gone and done the right thing, we would have hacked it, and we would have paid the price. And it's not just a question of me not being up to the challenge, just organizationally, we weren't at the right place.

Rob Collie (00:17:48): Office has really leaned into Power Query, it's a core part of the Excel ribbon now, basically taken over the prime real estate. So I think they're absolutely in on Power Query, and they're absolutely in on the value that it brings. It's just that they're not the right place to have invented it. In the same way that Office wasn't the right place to invent DAX, it's just not what Office does, Office does other things.

Rob Collie (00:18:12): If I wanted to turn this around and say, the historical struggle of the data side of the house has been that they haven't traditionally been as good at user experiences as Office was. But that gap is really closing, that has become an engineering discipline on your side of the house that it really wasn't when I was there. I used to describe Microsoft as there were user teams and engine teams, and there was no such thing as a team that was both. Office was the user team and the data platform, they built engines. But the engine team couldn't build user experience and the user experience team couldn't build engines. And so I think that's changed a lot. And this is a great example of it.

Sid Jayadevan (00:18:48): I mean to the point about Office and Excel, one of the things that has been a little different with Power Query is that we've embraced the open source model, perhaps a little bit more than for other products like that. We have the Office team contributing very heavily in our code base, not all aspects. It's the Power Query team that drives the majority of changes, but the Excel team is very, very involved. In fact, if you look at a lot of the developments around Excel on the Mac, the Office team has contributed very heavily to that. And so that ability to have other teams come in and make changes and they've really been a poster child for this on the Excel side, that has helped build Power Query into more of an ecosystem even within Microsoft.

Rob Collie (00:19:38): I didn't know that actually, I really had no idea that there were Office engineers contributing code. I just sort of naively I guess, assumed it was a one way street. You guys were sending them a build update every now and then and they were ingesting it.

Sid Jayadevan (00:19:51): And that is fundamentally how it operates because we do want everyone within the larger Microsoft ecosystem to be benefiting from the same enhancements, so there is a build that goes out every month for Excel desktop. But we also have a lot of teams across Microsoft who are contributing in a fairly big way.

Rob Collie (00:20:11): In terms of, we talked about the M language and all of that. I just told you the story about never getting the chance to do Data Merge on the Office team. I'm really deeply curious about how the M language got greenlit, how did the need for it get recognized and bubbled up into something that they got resources. Because like I just said, it's such a crazy thing. If you haven't experienced the pain of the world, in terms of automatically munging and transforming data, if you haven't experienced it, and most people haven't, even at Microsoft, most people haven't experienced that, trying to convey that pain to other people is very, very difficult. I look at Power Query as, look, this is something that the world needed, not just like a demographic, this is something that improved the world. And yet, I know from experience, it's very, very difficult to explain to people who aren't already on board, what the value is. Is there anything that we can talk about there?

Sid Jayadevan (00:21:06): Without getting into all of the details, we went through a number of iterations to get to where we are and where it started was with some precursors to M, which were more about modeling, what we set out to do. And there's a large number of people who contributed to this. And so some of this predates some of our contributions. I've been involved with the project on and off, in fact, I left at some point and came back to it. And a lot of the seeds of the project were in modeling related efforts. So ways of modeling your data, modeling your relational data model. And as I guess, in hindsight, could have been expected, folks started to realize that a lot of what you needed to do to have a successful modeling environment was enable transformations as a first class thing. And so at some point, you had a language that was a little bit of data modeling, and a little bit of transformations layered on top of that.

Sid Jayadevan (00:22:12): And frankly, over time, we talked about how the query thing became more and more important and became the essence of the product. The data modeling side of things faded to some extent, and the focus shifted towards transformation. And it shifted towards transformation of all data. There was a period, not just at Microsoft, but in the industry, where we were very focused on data as not necessarily a silo but homogeneous data stores. And when that heterogeneity of data became a reality that no one was going to change, the focus of tools like ours, and languages like M, shifted more towards that ability to embrace all kinds of data, wherever it might live, which, of course, changed the language and gave us what we have today.

Rob Collie (00:23:04): This will show how old I am, there used to be a series of commercials for Reese's Peanut Butter Cups, where two people would be walking along, one of them would be carrying a chocolate bar and one would be carrying an open jar of peanut butter for some inexplicable reason. And they'd bump into each other and the two would accidentally mix. And then they'd accuse each other, "You got your chocolate in my peanut butter." "No, no, you got your peanut butter on my chocolate." And then they would take a bite of it, go, "Oh, my God, this is the best thing." It kind of has that feel to it, doesn't it? The origins of M and Power Query, it's not like there was this anticipated union with DAX and what we think of as the VertiPaq, Power Pivot data model. That wasn't a mission statement from the beginning, it's just these two things ended up going together super, super, super well, sort of an accidental union. That's been my sense of it forever. Is that true?

Sid Jayadevan (00:23:53): I think that's a very fair assessment. Miguel, what do you think?

Miguel Llopis (00:23:59): Yeah, I tend to agree. I was actually thinking about the previous comment you made about the heterogeneous nature of the data space right now. So yeah, really, when you talk about big data, it's not really only about the volume of data, there's also the variety of data, both in terms of the sources you connect to, the schemas they have, the different keys on either side, and the need to use things like fuzzy matching and mapping tables and whatnot.

Miguel Llopis (00:24:22): And then lastly, it's also about the velocity of the data, there's some data that changes once a day, there's some data that changes once a quarter, there's some data that changes multiple times per second. And so providing tools for non technical users, which is the vast majority of people in the world, to actually be able to do this efficiently and with ease. And even for somebody who can do the hard thing, of course, who wants to do the hard thing if you can do it in much simpler ways? I think that that was key to us and just democratizing this whole problem space and of course, there's a lot more that we can do.

Miguel Llopis (00:24:55): And thanks, Rob for your list of suggestions from the team. We love those and within our team, we do have this whole bucket of what we call customer love, which is about, "Hey, give us a problem that you've ever tried to solve with Power Query and it didn't actually make the cut for you. And I will try and generalize that and give you a feature out of it." That's how many of our existing transforms came about.
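Editor's note: Miguel mentions mismatched keys and the need for fuzzy matching and mapping tables. As a rough illustration only, the Python sketch below uses the standard library's `difflib.get_close_matches` to map messy names onto a canonical list. This is not Power Query's fuzzy-merge algorithm. The `fuzzy_lookup` helper, the company names, and the 0.8 cutoff are assumptions chosen for the example.

```python
from difflib import get_close_matches
from typing import List, Optional


def fuzzy_lookup(value: str, canonical: List[str],
                 cutoff: float = 0.8) -> Optional[str]:
    """Return the closest canonical name, or None if nothing is similar enough."""
    # Compare case-insensitively but return the original canonical spelling.
    lowered = {name.lower(): name for name in canonical}
    matches = get_close_matches(value.lower(), list(lowered), n=1, cutoff=cutoff)
    return lowered[matches[0]] if matches else None


# Hypothetical canonical keys from a clean dimension table.
canonical_names = ["Microsoft", "Contoso", "Fabrikam"]

# A misspelled key from a messy source still resolves to the clean key.
matched = fuzzy_lookup("Microsft", canonical_names)

# A value with no plausible counterpart is left unmatched for manual review.
unmatched = fuzzy_lookup("Zzzz", canonical_names)
```

In practice, values that fall below the cutoff would typically be routed to an explicit mapping table maintained by hand, which is the other technique Miguel names.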

Rob Collie (00:25:17): It's just such a rich canvas. When you start from a language, you have a lot of future flexibility in what you can do, it's awesome. The heterogeneity thing again, I also really reacted to that, that speaks to me. Another thing that shows my age, I grew up during the peak Cold War between the US and Russia, or NATO and the Warsaw Pact, whatever. And so I read a lot of Tom Clancy and I was one of those kids.

Rob Collie (00:25:44): Something that really strikes me from that is that the two different philosophies of the 1970s, 1980s, Russian military strategy versus the United States is, you see it in every morning or every day of operations at a United States Air Force Base. Everybody at the Air Force Base, gets out in a big long line, and walks the entire length of the runway, picking up pebbles, and all kinds of foreign objects from the runway, because if any of that gets sucked into the intakes of these really sensitive airplanes there's going to be hell to pay, it's going to break it, it's going to go down. Whereas, the Russians built everything that they had, at least in theory to eat mud.

Rob Collie (00:26:23): I think the old world of BI was that 1980s American strategy. You had to have this absolute clean room. It's ideal, frictionless circumstances in order for everything to work right, which is, of course, it's completely unrealistic. The real world is dirty, it is noisy. There's chickens running across the runway, it's not just pebbles. Is there even a runway? And this wave of Microsoft tools, the Power BI beating heart, which Power Query is part of it, I mean, it is built for that real world messy, dirty reality. It's not the kind of thing that you imagine when you're sitting around in a whiteboard doing computer science. And when computer science can meet that kind of reality and perform, it's really something to behold. It's just a whole new era, isn't it?

Sid Jayadevan (00:27:16): Yeah, I couldn't agree more. It's messy and dealing with that messiness is still very much a work in progress. But that's the thing we're trying to embrace, that messiness that isn't going away anytime soon.

Rob Collie (00:27:29): It only gets messier, even our company.

Miguel Llopis (00:27:31): But at the same time tools get better and smarter. So how can we actually make it so that it's even easier and easier for you to do these things with Power Query in this case?

Rob Collie (00:27:41): Yeah. Some of the things that you can start to do with machine learning and AI to write the code for them, there's some scary stuff that can be done there. I know column by example is sort of the most straightforward poster child for that kind of thing. I do want to at least make one joke with you, which is actually the truth but it's funny, is that back when I would teach classes, we still teach a lot of classes, but they don't let me teach them anymore, because I'm not as good as the people on our team. But whenever I bring up M, and I would show people the code, and so we'd be using Power Query a little bit, and then I'd show them the code that it was generating. And then I would zoom in on that code. And I'd say, "And this word here at the beginning, tells you everything you need to know about where this thing came from." The first word of every Power Query script being the word let, I just talked about like the messy real world reality.

Rob Collie (00:28:31): But the word let at the beginning of every Power Query script, tells you this came from the ivory tower. I look at the class and say, "It's almost like a philosopher smoking a pipe, who then says to you, 'Suppose.'" What if we posit, and then the script starts? Like I said, I admire what M can do. But the M language itself doesn't speak to me in its raw form. I look at it and I kind of want nothing to do personally with editing it. A lot of people, especially on our team, they do, I'm just one of those people that's like, for whatever reason, I was willing and able to learn DAX and I typically don't learn stuff, I don't learn new tool sets, I don't learn new languages. The fact that I learned DAX is really an outlier for me. I'll never learn M, not in its raw form. I'm a button pusher, dyed in the wool.

Sid Jayadevan (00:29:25): And then we want to cater to all the constituencies. The folks on your team who want to edit the M, we want to make that possible. And for the many folks who would rather press the buttons, for that we have the visual interface.

Rob Collie (00:29:41): Do you have those personas behind the scenes where you talk about the person who only wants to push buttons, you have the unsophisticated user of M persona. Can we just name it Rob and I'll give you a picture of me going...

Miguel Llopis (00:29:53): Actually, I call dibs on that one because I'm that kind of person as well. And that's what I would push for most of the time.

Rob Collie (00:30:00): Damn it. And you being on the team, you got an inside track to be the persona. All right. Well, listen, I'm waiting in the wings. I'll be your understudy. So how much exposure have the two of you gotten to the next part of the chain, which is, do you sit around building Power BI models? Do you write DAX? Do you build data models?

Miguel Llopis (00:30:20): Yeah, big time. I mean, we use our tools, the tools that we build, we use them internally for, for example, understanding how users are using our products or understanding our backlogs and our feature tracking report, you name it. Not to talk about personal projects, I do have my personal projects with Power Query and Power BI as well for non work related stuff. And that's actually, in my experience, what has helped me the most to actually understand and internalize all of the end user pain points around this area and actually push the tool to become better. And I know Sid does quite a bit of this as well.

Sid Jayadevan (00:30:57): Yeah. The entire team does a large amount of eating our own dog food, dogfooding, and you've heard the Microsoft term for this. That's always been a very large part of what we've done. It's not just about using Power Query, it's about using it in the context of all of the things that Power Query is hosted within. And so Power BI, of course, and Excel and Power Apps, and aspects of Azure, we try to ensure that we're experiencing the end to end experience as much as possible.

Rob Collie (00:31:29): It's just a complete divergence from the path we've been on. But I want to at least mention to you before I forget that, in the past seven days, last two work weeks, off and on I've been teaching a little bit of Power Query to a high school football coach. We're just kind of messing around for the moment with Power BI through a pro bono project. It's just sort of a passion project of mine. I got to tell you, it's fun. This guy's eating it up. He's loving it. I'm showing him how to add error checking and things like that for when there's the temp Excel file still in the folder that he's trying to load from, that's going to mess things up. Well, you could filter that out and everything. And yeah, he's sponging it up. It's just cool to see it. It's all these unexpected places, you see these tools end up being used.
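
As a rough illustration of the "filter that out" step Rob mentions, here is a minimal sketch in Python rather than M. It assumes the temp file in question is the lock file Excel creates next to an open workbook, whose name starts with `~$`; the function name and the file names are made up for the example.

```python
from pathlib import PurePath


def loadable_workbooks(filenames):
    """Keep .xlsx files, skipping temporary lock files whose names start with '~$'."""
    keep = []
    for name in filenames:
        path = PurePath(name)
        is_workbook = path.suffix.lower() == ".xlsx"
        is_temp = path.name.startswith("~$")  # assumption: this is the temp file meant
        if is_workbook and not is_temp:
            keep.append(name)
    return keep


# Hypothetical folder listing
files = ["week1.xlsx", "~$week1.xlsx", "week2.xlsx", "notes.txt"]
print(loadable_workbooks(files))
```

In a folder-based query the same idea would be a filter on the file name column applied before the files are combined, so the stray temp file never reaches the steps that would fail on it.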

Rob Collie (00:32:14): So both of you seem to have a lot of opportunity to sort of drive the race car that you build. And that was not something that I really felt like I had much chance to do when I was at Microsoft. It's like, we built race cars, we have no idea what it feels like to sit behind the wheel. And so it's always surprising to people that with whatever tool I've been working on, the customers were better at using it than I was. It's nice that there's a little bit more of a culture now of using the tools even for personal use. Personal use is fantastic, there's nothing better than personal use.

Sid Jayadevan (00:32:46): Absolutely. As Miguel mentioned earlier, it's for both hobbyist projects, pet personal projects, as well as internal day to day work. Love using it for all of those things. And Miguel in particular has I think some soccer things that he probably uses it for, but I'll let him speak to that.

Miguel Llopis (00:33:10): Yeah, definitely soccer as well as a bunch of other things I wouldn't name. Yes, quite a few personal projects.

Rob Collie (00:33:17): It's really nice of you to call it soccer for us. I'm sure you don't call it soccer with your fellow soccer fans.

Miguel Llopis (00:33:24): Yeah. You mean with our football fans?

Rob Collie (00:33:24): Yeah. What are some of the craziest things you've seen? I'm sure that you've got just some really crazy stories of things that you've seen customers doing with Power Query that you never would have expected? Anything like that come to mind?

Miguel Llopis (00:33:37): Many things. So I guess you could take crazy in a couple of dimensions. One could be unrealistic expectations on the tool or the technology. The other one could be tremendously complex projects. So I'll actually head down the second path.

Rob Collie (00:33:53): Sure. Let's do that.

Miguel Llopis (00:33:54): I think the biggest Excel workbook with PQ queries I've ever seen, had probably about 280, 290 queries on it. I'm glad we introduced query groups as a feature because that person will be there in the world without them. But even there, it's a pretty heavy to maintain project.

Rob Collie (00:34:13): And the dependency.

Miguel Llopis (00:34:14): Yeah, I was going to say understanding query dependencies. So you do have some support for that in Excel today with query dependencies. We're working on way more interactive, highly visual experiences that eventually will make their way into Excel. But as of now they're available in the Power Query online experiences with what we call the Diagram View, which is currently in public preview.

Sid Jayadevan (00:34:35): 290 queries.

Miguel Llopis (00:34:36): Yep. And they're all legit. We literally sat together and said, "Let's simplify this." And actually, yeah, he could have combined a few things, but it actually made sense the way he had it organized.

Rob Collie (00:34:47): And is the endpoint of that data landing in Excel?

Miguel Llopis (00:34:52): Yes, it was inside an Excel workbook.

Rob Collie (00:34:53): Wow. Wow. You don't have any examples of people using Power Query or data flows to automate their home? For example, I have a friend of mine right now, who is setting up using Power Automate, he's setting up where if he gets a text notification from a certain Internet of Things system, it will go in and adjust the temperature gauge, the thermostat, turn on heaters, turn on humidifiers, things like that. It's a terrarium, he needs to maintain the balance in this biosphere that he's built. And he's got monitors in there, but all they'll send him are text messages. That's all he can get. But he's like, "No problem, I'll eat those text messages and feed them into the Power Platform. And next thing, we're adjusting temperatures and humidity and all that kind of stuff." I bet there's a lot of stuff out there like that, it's data transformation but analysis isn't the endpoint. It's being used for something else.

Sid Jayadevan (00:35:54): We're blown away by a lot of the creativity. We've seen a lot of these very self regenerative programs that people have created, where the queries adapt and do all kinds of things. It's a ton of creativity.

Rob Collie (00:36:11): One scenario, and now we're doing the program manager feature design thing. And one scenario that I've wondered about for a while is failures in a Power Query, the error handling. Using the moment of error, harnessing that, and activating a human workflow to address it. The way you're nodding, this is not the first time this idea has come up, right?

Miguel Llopis (00:36:38): Yeah, I was wondering if I had mentioned some of that stuff to you. Because today, within the Power Query Editor experiences, you do get some help with data profiling features, you understand duplicate values, you understand errors. To some degree, at least: within the data in the preview, or, if you choose to run that, over the entire data set. But nothing really helps you after you save that, and you say, "Yeah, refresh this thing every day at 8:00 AM," with understanding if that still is correct, if you get a new outlier value, if you get a new duplicate value, and you get some errors around that. That's one of the areas that we're looking at. And it goes back to the thing we were talking about earlier about, how can we further simplify this tool and make it more productive for the real users of it on a day to day basis. And this is clearly one of those areas. I mean, if you're putting together a report or a dashboard for your boss, you want to make sure that they don't start looking at the wrong data without you even knowing.

Rob Collie (00:37:32): Oftentimes, it manifests in some very sinister ways. Like if a data source succeeds in refresh, but it feeds you back nothing but zeros. [crosstalk 00:37:42] There's no runtime error. And then, of course, if you saw a report with nothing but zeros on it, you'd notice, you say, "Oh, clearly, this thing's dead." But if those zeros are only one leg of a five leg platform that makes a single metric, the answers you get on your report can still be credible.

Miguel Llopis (00:38:00): Yes, that is a problem.

Rob Collie (00:38:03): And I'm speaking from experience, I've been burned by exactly this sort of thing in the past. Even when there's a runtime error, it's almost always a human being that has to go do something. If a duplicate key comes in, that wasn't there before, what do I do about that? I have to-

Miguel Llopis (00:38:19): Wouldn't it be nice if we just fixed it for you? Or if we maybe told you, "We saw an issue and this is what we think you might want to do." And we give you a couple of options. And maybe you don't even have to go to the tool, maybe there's a quick text message you get, maybe somebody is giving you a phone call while you're driving, maybe it's an email that comes in and just with a couple of clicks, you can just get it fixed.

Rob Collie (00:38:41): These are all good ideas. I like this. This sounds promising.

Sid Jayadevan (00:38:44): One thing that we recently added in this space was integration with Power Automate. So that's more on the data flow side. And it's early days for that, but we've already seen some very interesting solutions. One of the things you can now do is have your data flow include a bunch of these reports for issues that you mentioned, you could perhaps partition off the errors or have a bunch of litmus test queries that check the data quality. And if those queries start yielding results, you can fire a Power Automate flow that can engage whatever workflow makes the most sense for you. Whether it's sending an email, whether it's writing something out somewhere for someone to take action, going all the way to sending someone a text message. All of those things are possible. They're perhaps not as frictionless and out of the box as they could be, but we're making some of those things more possible.
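
The "litmus test queries" pattern Sid describes can be sketched outside of Power Query. The Python below is only an illustration under stated assumptions: each check returns the offending rows, and any check that yields results triggers a notification. The helper names (`duplicate_keys`, `all_zero`, `run_litmus_tests`), the sample rows and the `notify` callback are hypothetical stand-ins; in the setup discussed here the checks would be queries in a data flow and the notification would be a Power Automate flow.

```python
from collections import Counter


def duplicate_keys(rows, key):
    """Return every row whose key value appears more than once."""
    counts = Counter(row[key] for row in rows)
    return [row for row in rows if counts[row[key]] > 1]


def all_zero(rows, column):
    """Flag the silent failure where a refresh succeeds but every value is zero."""
    if rows and all(row[column] == 0 for row in rows):
        return rows
    return []


def run_litmus_tests(rows, checks, notify):
    """Run each check; any check that yields rows is reported through notify."""
    failures = {}
    for name, check in checks.items():
        offending = check(rows)
        if offending:
            failures[name] = offending
            notify(f"{name}: {len(offending)} row(s) need attention")
    return failures


# Hypothetical refresh result: a duplicate id and nothing but zeros
rows = [{"id": 1, "sales": 0}, {"id": 2, "sales": 0}, {"id": 2, "sales": 0}]
checks = {
    "duplicate id": lambda r: duplicate_keys(r, "id"),
    "all-zero sales": lambda r: all_zero(r, "sales"),
}
messages = []  # stand-in for an email, text message or other workflow
failures = run_litmus_tests(rows, checks, messages.append)
print(messages)
```

The design point is that a healthy refresh produces empty check results, so the mere presence of rows is the signal to involve a person.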

Rob Collie (00:39:42): I think that problem of merging the automation with human like referees of the occasional error is probably as ambitious of a problem to address as Power Query was originally. I've got a lot of respect for that problem, placing myself in your shoes. Might not be quite that ambitious, but it's a large problem. It's a product level problem to solve as opposed to a feature. Every now and then like, I get some data where someone keyed in an exclamation point instead of a one, because their shift key was down, and all hell breaks loose over that exclamation point.

Rob Collie (00:40:24): You got a hard job, the error tracking in your system, it's many levels deep. We all know the experience of you get the error, and the top 11 errors all say exactly the same thing. And you scroll through the list to get to the one at the bottom that tells you hopefully, what really happened before the downstream errors happened. It's hard to bubble up the right error to the right person at the right time when almost by definition, you don't know, you can't anticipate what this error is going to be, you have no idea what's going to come in. So I recognize this as sort of a frontier for you, but I do not mean to trivialize it at all. It's only an improvement. It's not like you need to do this otherwise, everything you've done is... No, you can stop today completely and Power Query is arguably complete, you just have so many places where you could-

Miguel Llopis (00:41:16): Go and tell that to Satya, we want to still keep our jobs. Got to find new challenges.

Rob Collie (00:41:20): Well, next time I talk to Satya, next time he calls me up for advice. Yeah, I think it would be a shame if you did stop. It's a compliment to what you've got, that if you stopped today, it's already well past amazing. I'd say to students and clients that there are two engines at Microsoft, two data engines in particular, that all of Microsoft's competitors wish they had instead. One is, whatever you're going to call it, the DAX and data model engine, VertiPaq. Microsoft is not very good at naming, I don't know if you all know that. And then the other one is the M engine, the Power Query engine, which also by the way, goes nameless in all of your products. It's just Get Data or Import or whatever now, Get & Transform.

Miguel Llopis (00:42:03): It's the M engine and Power Query is the experience.

Rob Collie (00:42:06): These two engines, whatever you call them, they belong in the software Hall of Fame. I believe that. And this is a very vicious critic of software, who's talking to you right now. I hate software. And these two things, they demand your respect, it's got to feel good to have been involved in something like that from such an early stage. It's got to be one of the most gratifying sorts of experiences for a software engineer because most of the time, it's not like that.

Miguel Llopis (00:42:33): This is such a tough interview, Rob.

Rob Collie (00:42:36): To make you guys feel all gushy about yourselves.

Miguel Llopis (00:42:43): Yeah. Don't know what to say [crosstalk 00:42:44].

Sid Jayadevan (00:42:43): That's very kind of you.

Rob Collie (00:42:43): Oh, come on, you've lived it. Right? You've probably also lived as software engineers, you probably lived the other kind of project too. There's all kinds of dead ends in software that you can chase them for years.

Sid Jayadevan (00:42:54): I think one thing that's been a big differentiator with this one is, so Miguel and I are here today, but there's a team that has stuck together over an extended period of time. And it's the most fun I've had in my time at Microsoft. I'm very, very fortunate to work with those folks. For a problem like this, there is a kind of continuity that becomes necessary to... You talk about the iteration and needing to keep going. And we have a lot of work ahead of us.

Sid Jayadevan (00:43:24): But the thing that has made this easy and fun, at least from my point of view is the team has been phenomenal. You tend to have a lot of churn on teams, and you go through phases, and people come and go. But this has been one where there's a set of folks. And I'm not talking about a handful of folks, it's probably a few handfuls of folks who really pushed on this over many, many years. I think that's one thing that's been a little different vis-a-vis a lot of other projects, that there's been a set of folks who have stuck with it and have been incredibly passionate about it. And that's been a big part of Power Query.

Miguel Llopis (00:44:02): Completely agree.

Rob Collie (00:44:03): Some products really require that kind of continuity in order to continue being successful. Excel, by the way is one of them. I think Excel, I don't really know what it's like today, but when I was there, there was pretty healthy turnover every release on the Excel team. And the developers, the engineers, the ones actually writing the code, they had a bit more continuity, actually quite a bit more than the program managers. It was every two years, the school bus would drive up, all the program managers would get on, it would leave, a new school bus arrives with younger program managers and would drop them off. I got off that bus one day and joined the Excel team. And the engineers on the team were just like, "Ah, the new youngsters, we got to train these people now too."

Rob Collie (00:44:55): It was a year and a half of working on Excel before I stopped coming up with feature ideas, like wouldn't it be cool if Excel could do this. It was a year and a half before I stopped coming up with ideas like that, where they'd look at me and say, "Yeah, we already have that." Honestly, I think that culture, that continuity was enforced more by a handful on the Excel team when I was there, they were keepers of the flame, if you will. And there was like one on the program management team, half a dozen on the dev team.

Sid Jayadevan (00:45:27): And you have a lot of those projects where you'll have one or two keepers of the flame. And I think what's been unusual with Power Query, at least compared to other projects I've been on is that there have been many, many keepers of the flame. And of course, you want fresh ideas. So you want people to be coming in and bringing those ideas, and we've had a lot of that as well. And so there's been keepers of the flame, there have been challengers of the flame in a very good way. So we've had that mix. But there has been a lot of good cohesion.

Rob Collie (00:45:59): It sounds like a good title for a Kickstarter funded board game, Challengers of the Flame. I'll tell you what, well, you all get equal rights. We'll call it a common intellectual property, that name. I'm hereby seizing 1/3 ownership in Challengers of the Flame, LLC.

Sid Jayadevan (00:46:19): What was the board game at the end of Office Space, the Jump to Conclusions board game?

Rob Collie (00:46:28): I don't actually remember, I've seen that movie so many times. Now, I've got an excuse to go watch it again. Tell my wife, "Listen, this is important. This is for work."

Sid Jayadevan (00:46:38): That was our quandary, what do you name the thing?

Rob Collie (00:46:43): So how much commonality is there, I'm assuming a lot, between data flows and the version of the M engine that lives in Power BI?

Miguel Llopis (00:46:55): Basically it's the same engine. So data flows, the way I like to talk about this is layers of the onion. So if you think about the M engine as the core of the onion, then the next wrapper around that is the Power Query experience that allows you to create queries that run in M. The outer layer on top of that is really the data flows, which really automate and orchestrate many different sets of Power Query projects that were defined with a Power Query experience to generate M that runs.

Miguel Llopis (00:47:23): So whereas you could have a data flow that maybe brings in, say, your customers data, your customers table, or your customers entity, you may have another data flow that connects to that customers entity and then maybe does a bunch of additional Power Query and M query transformations to work out your customers who are most likely to churn. And it's the orchestration of that: whenever that customers table gets refreshed, cascade refresh everything else that depends on it. That is what data flows are.
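
To make the cascade idea concrete, here is a small Python sketch of the ordering problem only, not of how data flows are actually implemented. Given which entities read from which, it works out what needs refreshing after one entity changes, and in what order. The entity names and the `cascade_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def cascade_order(depends_on, refreshed):
    """Return the downstream entities to refresh, in a valid order, after `refreshed` changes."""
    # Collect everything that depends on `refreshed`, directly or indirectly.
    affected = set()
    frontier = [refreshed]
    while frontier:
        current = frontier.pop()
        for entity, sources in depends_on.items():
            if current in sources and entity not in affected:
                affected.add(entity)
                frontier.append(entity)
    # Order the affected entities so each one runs after the sources it reads from.
    graph = {e: [s for s in depends_on[e] if s in affected] for e in affected}
    return list(TopologicalSorter(graph).static_order())


# Hypothetical entities: each maps to the entities it reads from
depends_on = {
    "customers": [],
    "churn_risk": ["customers"],
    "churn_report": ["churn_risk"],
    "products": [],
}
print(cascade_order(depends_on, "customers"))
```

Refreshing `customers` here schedules `churn_risk` and then `churn_report`, while `products` is left alone because nothing connects it to the change.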

Rob Collie (00:47:55): That makes sense to me. One of the challenges that I know that Power Query faces is that at tremendous scale, when the data is just gigantic volumes, the elapsed time of a query can get up there. And it's just an optimization thing. It's almost like the ideal software problem to have as engineers. How much progress has been made over the years? I haven't really been paying much attention to it. I just remember from the very early days, people saying, "Okay, it's great, but we can't use it for the 500 million row data set, going through a Power Query, just takes too long." Have there been any strides made? Again, I'm really sympathetic to this, it's a really hard problem. Power Query has to process every single row, it can't do the things like the VertiPaq Engine does where it sort of groups rows into clusters and treats them as one band of rows, you don't get those really nice columnar in memory tricks when you're performing transformations. So you're kind of up against physics in a way.

Miguel Llopis (00:48:56): Yeah, great point. So there's actually two avenues we can take to answer that question. I'm going to talk about both, I'll just call them out. And then I'll answer the easy one and I'll let Sid answer the hardest one. One is about increasing the scale of what you can process with Power Query. And of course, you need to do that. But on the other extreme, there's also the need to make it clear to the end user clicking those buttons, as Rob usually does, that there is a problem, so that they can correct that problem before it actually becomes the root cause for things that are many, many steps further down the pipe.

Miguel Llopis (00:49:29): And so in this area of making things more clear to users, we're actually introducing quite a few new features. We just announced something called the step folding indicators. So it's a feature we recently launched inside Power Query online, inside data flows. As you connect to a data source, let's say, SQL Server, and you connect to the customers table, and then you apply a filter to maybe say exclude customers in the US, then on your filter step it will actually give you a tick next to it to say, "This has actually been pushed down to SQL because SQL can run filters like this one." Now you go to a different operation that does not fold. As a new step, it will actually immediately tell you, "Hey, this thing is no longer going to run in SQL, we're running it locally, we're compensating here, this is what's going to happen. In the extreme, this might actually cause you issues, click here to learn more, learn some best practices for how you could do things differently." And there are many other features that we're working on there. This is the most basic way to tell the most basic end user about, "Hey, there might be a problem," it's like the engine light in your car dashboard.

Miguel Llopis (00:50:33): We're also looking at things like query plans, more detail, deeper information for slightly more advanced users who actually understand the underlying SQL, the underlying code behind it, to go reason about okay, where are actually things going south? How do I understand this better? So I answered the easy part of the question, which is how do we make it clear that there's a problem? Now Sid can talk about the exciting stuff we're doing on the scale.

Sid Jayadevan (00:50:56): And I'll have, I guess, an unsatisfying, cryptic answer to that, because it's probably our largest area of investment right now. But we don't have anything that we can really announce yet, but it's something we're working on. And that should come as no surprise, because we hear a lot of feedback in the space. And there's a lot going on at Microsoft and in the industry in this space around making compute more available, even if your data lives somewhere where there isn't compute. And so that's something that we will definitely be investing in and that we're actively working on.

Rob Collie (00:51:32): I actually find that to be a very satisfying answer. Because honestly, all I want to know is that A, people are working on it, and B, there's optimism, there's still improvements to be made. That's all I really need to know. I mean, there's a nerd part of me, it's like, "Okay, come on. How do we do it?" But even then probably, if I got too close to it, I'd probably go, "Oh, yeah, now I'm bored." I'm really just interested in the fact that it's going to happen. Have either of you seen these, it's a meme, but it's a YouTube meme, a format that's Hitler losing his cool, screaming at his generals in the bunker, and the subtitles have been replaced with something completely different? You seen these?

Sid Jayadevan (00:52:15): Yep, seen those.

Miguel Llopis (00:52:16): I've not, I'm too young for that.

Rob Collie (00:52:18): Oh, really? YouTube didn't go and record Hitler in his bunker. I don't know if you know that YouTube is a relatively recent invention that probably has happened in your lifetime.

Miguel Llopis (00:52:30): But I just don't have time to watch it. There's just so many soccer games to go watch. Sorry, football games.

Rob Collie (00:52:36): Football games. I agree. So I made one of those a long time ago, making fun of Tableau. And in terms of the first three months of its existence it's probably the video that's been watched the most of all the things I've ever done on YouTube. There's a part at the end where he mutters under his breath, he turns to look at his subordinates and says something like, "And if you think we're paying for those Alteryx licenses, you better be sprucing up your LinkedIn." So as the Power Query folks, I made that joke for you.

Miguel Llopis (00:53:09): [crosstalk 00:53:09] Geek.

Rob Collie (00:53:10): There's a lot of inside baseball in that video. Even the Tableau employees that have seen it, look at me and say, "Okay, that actually was pretty funny." What's next? What have I not asked about? Such an exciting space with so many opportunities.

Sid Jayadevan (00:53:26): We have a whole new interface coming in terms of a more diagrammatic visual representation of the queries. Miguel may have alluded to this before. That's a big one, changes the profile of the product quite a bit. We're not taking anything away. And we don't want to make things tricky for people who are familiar with the existing interface so it's strictly additive. But that's one we're really excited about. We've tried a few things there. It is a new interface, but we're also using it as a way to address some of the feedback that people had, just around how you track relationships and make it a little more fluid to chain things together. So that's one that I think the team's very excited about. So we're going to push that one out pretty soon. And that's already in preview, so you can go play with it.

Rob Collie (00:54:19): I think I should, as one of the absolute sloppiest designers of Power Query scripts in the world. If you ever want examples of really, really, really ugly, I can't believe this Rube Goldberg sequence that someone's written, all you need to do is just ask me for anything that I've done. I've got stuff now that I'm just like, "Okay." I've got four queries that are basically one to one linearly feeding into each other, that their only purpose is to feed the next one. And they're not even sorted in the proper order in the query pane. Even I don't remember which one is the root, I don't remember which one is the first one in the assembly line. Every time I go back and look at it, I have to sort of re-trace, trace, trace, trace. Like, "Okay, that's right. That's how this thing works." You want ugly, I got you covered.

Miguel Llopis (00:55:06): Yeah, we would love to see those and see how what we're investing in actually behaves against that. So yeah, I just sent you a link on the chat window for the Diagram View, would love your feedback on that. And please share feedback about every other area. And again, we will take it and we'll generalize it, and we'll make it into something that improves the product.

Rob Collie (00:55:26): And this poor high school football coach, his first exposure to Power Query is with exactly the example I just told you about. He has no idea how much better it can be.

Sid Jayadevan (00:55:37): That's cool. You know that ad hoc style of using the product where you don't necessarily architect how your queries come together, in some ways we want to cater to that even more, we don't want to go in the direction of some very formal modeling exercise. And we want to keep enabling that style of using the product. And so this tool is not meant to police any of that, it's more just to help you understand it better. I have some mashups where I have so many queries, and they could have been designed way, way better. And over the passage of time, a few months later, I look at the thing, and I have no clue what I did, back when I created it. So this is meant to help with those sorts of things. It's more to help you decipher what others did, and sometimes help you decipher what you did.

Rob Collie (00:56:29): A previous version of you is almost just as inscrutable as another person's work. I can be away from it for two days and come back and go, "What was I doing here?" Same is true, by the way, with spreadsheets, traditional spreadsheets, non DAX spreadsheets. I can generally go back to one of my DAX models, and pretty quickly get back into the personality of what I was doing there. But the old spreadsheets, using just the Excel formula language, and really pushing it to its limit, oh my gosh, those things. I'm always impressed at how smart I must have been in the past to have put one of those together. The current version of me always feels dumber than whatever the version was, that was able to do what I did in Excel back in the day. So, connectors?

Miguel Llopis (00:57:14): There's a roadmap on connectors, there's some... Overall, our strategy with connectors is one where we have the custom connectors as the key that empowers anyone to build connectors. This could be you building your own connector for whatever you're trying to do. Or this could be an actual ISV company who owns an underlying data source backend, who actually wants to provide connectivity to that from Power BI, from Excel. And we do have certification programs around that.

Miguel Llopis (00:57:42): So really, there's some new connectors coming out of our team, there isn't much in terms of net new connectors. There's a bunch of connectors that our team owns from the early days when we didn't have this way to actually extend our SDK. And there is where you see most of our investments on making sure that X connector now can use this certain new feature that the underlying back end added or that customers are demanding now, more than others.

Miguel Llopis (00:58:09): So I wouldn't say there's much in terms of excitement there on connectors to cover at this level, it's very pointwise, feature level things on existing stuff. Then there's the experiences, the Power Query experiences. So yeah, everything we talked about regarding diagram views, or more by-example-like experiences that infuse AI into the product. On the data flows front, we talked a little bit about the refresh based data quality and monitoring stuff, which is actually not formally in our public roadmap. But just from the discussion we had, I think it just screams that, hey, this is actually a useful area. Those are kind of the big pillars.

Rob Collie (00:58:47): There was something interesting as you were talking about the connectors. Of course, Microsoft cannot write connectors for everything, the list of everything is damn near infinite. And a reasonable percentage of the time, the system that I wish there was a connector for is a non Microsoft product that, at best is sort of neutral towards integration with Microsoft technology and other times it's openly hostile to it. And so, at our company, of course, we're Microsoft at its core, we use a lot more Microsoft than other stuff.

Miguel Llopis (00:59:21): Come on, you don't need to apologize, what else are you using?

Rob Collie (00:59:24): We use a lot of things. And so like Salesforce is our CRM, and in terms of workflow, it's one of our most central systems. Certainly not our only system. Right off the bat, we've got an alien right in the middle of the story. And we park a lot of data for our own internal BI. And by the way, our internal BI is very, very, very sophisticated today, it's not a stretch to say that we simply could not survive without it. It's not like our business operates and then we use BI to optimize it, it is life support. It's the oxygen supply, it is really, really, really, really critical to us and our business model.

Rob Collie (01:00:07): So we use another third-party product called Stitch, which you've probably heard of. They've written a bunch of connectors that essentially dump data into various endpoints that they know about. And so we get a lot of data out of our core systems into the Azure Data Warehouse, so not just Azure SQL, via Stitch, and then Power Query kicks in. It's not like it just lands there. Gosh, our Google AdWords data, we're grabbing that from Stitch into Azure Data Warehouse.

Rob Collie (01:00:40): And then because the data that's grabbed... It's so weird, guys. AdWords data is like day-to-date running totals. So every time you take a snapshot of it, it's like you had three clicks last hour, now you have seven clicks. Does that mean you have 10 clicks today? No, you have seven. So we've got Power Query that is doing a group by and taking the max, or grouping by the most recent timestamp on that day, because Stitch doesn't do anything magical for us; all it does is just a raw data dump from one place to another.
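As an illustration of the running-total cleanup Rob describes, here is a minimal Python sketch. The actual logic in the episode lives in Power Query and is not shown in the transcript, so this is only an approximation of the idea. It assumes hypothetical snapshot rows of campaign, snapshot timestamp, and clicks so far that day, and it keeps the value from the latest snapshot per campaign per day instead of summing the snapshots. The field names and sample values are invented for the example.

```python
from datetime import date, datetime

# Hypothetical snapshots: (campaign, snapshot timestamp, clicks so far that day).
# Each value is a day-to-date running total, not an hourly increment.
snapshots = [
    ("brand", "2021-03-01T09:00", 3),
    ("brand", "2021-03-01T10:00", 7),
    ("brand", "2021-03-02T09:00", 2),
    ("generic", "2021-03-01T10:00", 4),
]


def daily_totals(rows):
    """Return {(campaign, day): clicks}, taken from the latest snapshot of each day."""
    latest = {}
    for campaign, ts_text, clicks in rows:
        ts = datetime.fromisoformat(ts_text)
        key = (campaign, ts.date())
        # Keep only the most recent snapshot seen for this campaign and day.
        if key not in latest or ts > latest[key][0]:
            latest[key] = (ts, clicks)
    return {key: clicks for key, (_, clicks) in latest.items()}


totals = daily_totals(snapshots)
print(totals[("brand", date(2021, 3, 1))])  # 7, not 3 + 7 = 10
```

Picking the latest snapshot rather than the maximum value is a judgment call: the two agree as long as the running total never decreases during the day, which matches Rob's "group by and take the max, or the most recent timestamp" description.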

Rob Collie (01:01:14): It's really neat like this, going back to that metaphor of, you want your jet plane built for the reality of the world, with all kinds of noise, and all kinds of variety, and all kinds of unpredictable things. And even without dedicated Power BI connectors, for a lot of our systems, it doesn't matter. We're going to get that data. And the fact that the Microsoft tools participate in this larger ecosystem.

Rob Collie (01:01:39): I've always been really ambitious about defining sort of the new template for what consulting firms should look like in this new world. It sounds like a cliche, but 11 years ago, I'm sitting in my office one day in Cleveland, and I was using Power Pivot and it just hit me like a thunderbolt. I'd suddenly done something that was not possible. And I'd done in the space of like an hour something that had taken weeks and weeks in the previous world. And I could kind of see that the world was going to change, that the size and duration of a typical project was going to shrink dramatically, and still have the same amount of impact as the big long project, in fact, actually better. It's going to have more impact because the short project means that you're actually holding people's attention long enough to iterate and get the real results that the longer projects never got to because people got too exhausted, and just called it done even when it wasn't. So the size of the average project was going to compress dramatically.

Rob Collie (01:02:37): So the utilization model for a traditional consulting firm, which has long been like, park a handful of people on a six-month-minimum project, that whole business model was going to die. Now, I thought it was going to happen a lot faster than it has; it still hasn't happened, really. We've reached the point where the world has intellectually agreed that the citizen developer model is primary, and is important. But for a long time, that was still heresy. So we've reached the point, we've intellectually accepted that.

Rob Collie (01:03:06): But that doesn't mean that the real on-the-ground muscle memory has changed. This has been the mission for 11 years: go and build this firm. However, we never took any investment. It's not like there are really people who would ever want to fund a consulting startup anyway. Angel investors and venture capitalists, they're always looking for tremendous intellectual property. They don't want people involved. A consulting firm has too many people; it's too good of a deal for too many people. They want something where you can essentially charge rent when it's done. So we wouldn't have really been able to attract that kind of funding anyway. Plus, they would have ruined it if we had taken their money.

Rob Collie (01:03:40): So we've organically grown, all of our hires, and all of our growth has been funded out of revenue, which makes it slow or slower anyway. But someone told me something a long time ago, which is like, "Let me tell you about my 10 year overnight success. It's not overnight, but it has been 10 years." And I'll be completely honest with you, I think that there's really no limit to how large we can be. It's been a long road, but the way we operate is to run with these tools as fast and as impactfully as they allow. So we're great for the customer. We're great for the customer in a way that I don't think really any other Microsoft partner is. It's a very hard business model. It's obviously the thing that the customer needs. But it's a hard business model to sustain which by the way, we've used the Microsoft platform to make it work internally.

Miguel Llopis (01:04:32): So your ceiling, your bottleneck, is actually going to be at the very least talent acquisition, so that you can scale to more people as you scale to more customers.

Rob Collie (01:04:41): Yeah. I learned a lot of things at Microsoft about interviewing too. And we're using a lot of systems, we have a lot of actual both software and delegation to assistants and things like that, that allow us to scale. The hard lessons that I learned about interviewing at Microsoft, we apply that at national scale. So we have like a 2% offer rate for our candidates, and we get to pick the best of the best. So I actually don't think we have a supply bottleneck either.

Sid Jayadevan (01:05:13): That's a lot of interviewing.

Miguel Llopis (01:05:15): We're hiring. So if you have any pointers, we appreciate them too, both engineering as well as PM.

Rob Collie (01:05:21): Well, I don't know, that's the kind of consulting fee that I'm going to have to [crosstalk 01:05:27]. I'm going to have to have McKinsey white label me, so that I can charge Microsoft the millions of dollars that they would pay McKinsey, but they would never pay Rob Collie. Definitely, yeah.

Sid Jayadevan (01:05:42): Very interesting. Fascinating story. I mean, I've watched from afar, but I didn't know many of these details. And so yeah, very helpful.

Rob Collie (01:05:52): With the engineering mindset, I think both of you actually would be really sincerely kind of interested and fascinated by all the things that we've developed and found about how to incentivize the right things with our consultants, for our clients. And there's something almost, it's not patentable, it's not protectable. But there is something, in the same way that software has intellectual property, in our system, our all-up system: software, people, workflow, all of that. I'm pretty sure this is the only instance in the world like it of a company that operates like this. We've had to discover how to do this; there was no template to follow. You guys both know how exciting that kind of problem is. The same sorts of things that get you geeked up about going to work at Microsoft to solve that performance problem or whatever we're talking about, that same itch being scratched, but in a different plane. No, you can't have any of my people, Miguel.

Miguel Llopis (01:06:48): Good to know.

Sid Jayadevan (01:06:50): And have you been geographically distributed throughout?

Rob Collie (01:06:53): Yeah. It was really like 2015 was the first time that I realized the demand for work I was bringing in was exceeding my personal capacity to address it. And it was just me running the website and doing the trainings and doing the consulting up until that point, basically. And I knew that I didn't have time to train up another consultant that could do the work that I was doing; I needed to find someone who was basically ready today. And I knew that I wasn't going to be able to do that if I was just like, "Let's just find someone in the vicinity of where I currently live."

Rob Collie (01:07:30): So the very, very, very first candidates, the very, very first interviewing that we did as a company, was remote. And the first few people to pass this interview, which again, was designed 100% from my experience interviewing program managers at Microsoft, and especially the fact that I'd done it the wrong way for years and then I did it sort of the right way for the last third of my career. The first people to pass it were all over the country. They were in Oregon, they were in Iowa, they were in Iowa, they were in Alabama, and I was in Ohio.

Rob Collie (01:08:05): So it's actually something that's really interesting. And I'm almost a little bit bummed about the fact that COVID has rewired everybody this way. Because for a while, I think we'll still have this advantage for a long time in a way, but especially given the nature of the consulting industry that we're in, which is still very in person. When you can hire from any geography, you can afford to be a lot more selective. You just have a bigger denominator. If you want to hold a really, really high quality bar and clear it, you can do that, if you're not geocentric. So in a way, we were kind of forced into behaving optimally from the beginning. It wasn't some fiendish genius plan, like, "Oh, we will be geo distributed, and we will therefore get the best talent, and bahaha." It wasn't like that at all. It's just like, "I need a person and there's no way that I'm going to find one in Cleveland." And all followed from there. So it's really insane. Heck of a journey.

Sid Jayadevan (01:09:09): It's an amazing success story. And sounds like you guys feel like you're just getting started.

Rob Collie (01:09:16): And trust me, plenty of failures along the way. I found out somewhere along the way that I actually am not good at running a business. And it's like you find out you're not good at driving a boat, and the way you found out is, I just crashed it onto a reef. I did that. I almost killed my own baby at one point. And I had to realize that I needed to share the steering wheel. And so the guy whose podcast went live this week, Kellan, he's the architect of almost all of these good things I've been talking about. My vision and the things that I wanted to have happen never, ever would have met reality without Kellan to bring them to life.

Rob Collie (01:09:54): And he was one of the first people to pass the interview. I hired him as a consultant originally. I had no idea that I was hiring my other half at the time. It took a long time for me to come to terms with that. So hard road, lots of humbling, really humbling experiences. Well, guys, I'm sincerely grateful to be able to grab a couple hours of your time. Thanks for doing it.

Miguel Llopis (01:10:16): Thanks.

Sid Jayadevan (01:10:16): Thanks for having us.

Announcer (01:10:18): Thanks for listening to the Raw Data by P3 Podcast. Find out what the experts at P3 can do for your business. Go to powerpivotpro.com. Interested in becoming a guest on the show, email lukep@powerpivotpro.com. Have a data day.


Rob Collie (00:01:06): And because of that, along those lines, during this conversation, I kept trying to get these two gentlemen to take the victory lap. They didn't take that bait, too humble, too cognizant of the work yet to be done, which of course, is really how we would want it, isn't it? Because seriously, it's good to know that there are people like this at Microsoft, who are stewards, really, of our futures. They're the ones building, not just the tools that we already use, but the tools that we're going to use in the future. There are a few places in here where we geek out a little bit as computer scientists, but mostly, the conversation stayed very firmly rooted in the human beings, the human element again, which is what it's all about. I'm sincerely honored that they took two hours out of their busy schedules to spend speaking with me and speaking to you on this podcast. I hope you enjoy it. I hope you learn something from it. So let's get into it.

Announcer (00:02:03): Ladies and gentlemen, may I have your attention, please.

Announcer (00:02:07): This is the Raw Data by P3 Podcast, with your host, Rob Collie. Find out what the experts at P3 can do for your business, go to powerpivotpro.com. Raw Data by P3 is data with the human element.

Rob Collie (00:02:23): Welcome to the show, Sid and Miguel, how are you today fine gentlemen?

Sid Jayadevan (00:02:27): Very well, thank you. Many thanks for having us.

Miguel Llopis (00:02:30): Yeah, doing great, Rob. Thank you so much for having us, it's a pleasure.

Rob Collie (00:02:33): Seriously, to get a hold of the two of you, and coordinate calendars and make this happen, it's an honor for us. So this was something that almost from the moment we launched the podcast, asking members of our team at P3, what would be interesting topics. One of them that just sort of keeps insistently coming up is where did Power Query come from? And so I was on this mission to hunt down some of the people who could speak to that. And there's a lot of things we can talk about. What are your roles today at Microsoft?

Miguel Llopis (00:03:06): Sid, do you want to go first?

Sid Jayadevan (00:03:07): Yep. So I am an engineering manager at Microsoft Manage, the team that works on data integration, with Power Query being one of the key elements, but we have a variety of other things we do with connectors, gateways, data flows, all of which are very symbiotic. Power Query depends on all of those things, and all of those things depend on Power Query. So there's a larger space that we operate within, but Power Query has been around the longest of all of those things.

Rob Collie (00:03:39): So I didn't know this going in. But I have some complaints about this QuickBooks connector. I'm sure you know what those complaints are too. They will not be ones that you've heard the first time from me making a note of that.

Sid Jayadevan (00:03:55): No points for guessing.

Rob Collie (00:03:56): No points. Okay. Miguel, what are your responsibilities today?

Miguel Llopis (00:03:59): I'm the program manager lead for Power Query, and connectors and data flows. So a bunch of technologies and experiences that Sid was talking about. Been up on the team for quite a while, maybe about the same as Sid, maybe a bit less. He's older than me, so that would show through this interview.

Rob Collie (00:04:17): Oh, yeah. Well, we're only recording the audio here, you can't really tell.

Miguel Llopis (00:04:22): I meant from his wisdom, not from the look.

Rob Collie (00:04:24): Oh, I see.

Sid Jayadevan (00:04:26): That's Miguel's way of saying I'm Palaeolithic.

Miguel Llopis (00:04:30): There you go.

Rob Collie (00:04:30): Oh, yeah. I see. You're also a free range gluten free?

Miguel Llopis (00:04:35): Of course.

Rob Collie (00:04:39): So you both go back aways on Power Query specifically. So from my perspective, from outside the company, I left Redmond in 2009. And then I sort of formerly left Microsoft in February of 2010. So I've been formally gone for 11 years, and sort of informally gone for a little longer than that. It's really kind of hard for me to even get back into the before mindset, when we had DAX, and data modeling, and we had the VertiPaq Engine, they're all still so incredibly central to Power BI today. And we had them in the Power Pivot form, we didn't even have SSAS Tabular yet.

Rob Collie (00:05:16): But we had nothing in this giant void that Power Query came along, to me, basically, out of the blue. I had no advance notice of this. My sources in Redmond, my spies, they hadn't hooked me up with the information that something amazing was coming to come along and complement the tools. There are so many times, guys, seriously so many times where I or we would be working with a client and we would know what the ideal data model would look like. But they couldn't do it because the data that they were getting wasn't in the right format to build the right data model. Even something as simple as you need a lookup table or dimension table, and no one's giving it to you.

Rob Collie (00:05:58): So it was always like, "Oh, now you need to go find a DBA," that you're either lucky or you're not, you either had one or you didn't. And if you were unlucky, you were just out of luck, there was no recourse. You were just going to have to make these changes to these tables manually. There was no automatic refresh anymore. It was really a tremendous limitation. And then suddenly, and at the time when it first arrived, what was it? The first name? Was it Data Explorer?

Miguel Llopis (00:06:27): Yes, Data Explorer.

Rob Collie (00:06:30): Even that initial name sort of conveyed a different mission than what it sort of morphed into. I don't even really remember. It was also a way to connect to lots of different things, wasn't it? And Power Query is today. Were you around when Data Explorer, when that name was chosen? What was behind that?

Miguel Llopis (00:06:46): Yeah, that would be an interesting story. So actually, just going one step before that, and I think we're getting towards 2012 at this point, I think. The very first incarnation of what today's Power Query in market was something called SQL Azure lab for data exploration. It didn't even have an actual product name, we're going to call it that way. It was SQL Azure Labs was a set of initiatives across the SQL and national teams back in the day to actually a spike different sets of technologies that will help in different segments, like data exploration will be one, data visualization. There were a bunch of things that came out that way.

Miguel Llopis (00:07:22): It was actually a full cloud based Power Query like experience to actually connect to data, transform data, and then output data in different ways. There's ways to actually get your data out as an all data endpoint that you could use maybe to create an app, maybe to consume from Power Pivot, to your point drop. The feedback back then, and again 2012 was, hey, Pivot is working in Excel, we want these experiences in Excel. So we repivoted all of that, no pun intended with repivoting, towards a client base experience that was actually an Excel add in that we initially released for Excel 2010 and 2013.

Miguel Llopis (00:07:59): There was quite a bit of naming related discussions, but I think the data exploration aspect, everyone got used to it. So we ended up coming up with Microsoft codename Data Explorer for Excel. That was a very first name for the Excel add in, which later than that, a few months later, as we went into GA, they actually got renamed to Power Query. And really the alignment there was with the power family, the power tools, Power Pivot, Power View, Power Maps back in the day as well. And then Power Query as a way to actually bring data in. Maybe Sid remembers some of these discussions more than me. I know there was also an alternative option, which was Power Import, that really, we went for the term query because it really reinforced the notion of repeatability and refresh ability of those queries, no pun intended.

Sid Jayadevan (00:08:45): I think we tested many, many terms, and import was just one of them. But we felt like the essence of the product was that ability to query ad hoc and at will, and so we really wanted to focus on that. And so that's how we ended up with the query piece of it.

Rob Collie (00:09:03): I think Power Query was a great name. Honestly, it didn't really land for me what y'all were giving us when it was still called Data Explorer. That name was actually a very large cognitive obstacle for me, explorer, it sounds like an analysis tool. Now, knowing where the roots were, you had a name, there's a more appropriate name for what you were doing, probably when it was still that Azure Lab thing. But it's so funny, this happens all the time, when your mission pivots, certain parts of it still kind of leak through, like the old name, by default.

Rob Collie (00:09:36): Just like the old story, whether it's true or not about the railroad tracks being the width that they are because the Romans chariot was that wide. It's just like, what we did yesterday is because we did it that way the day before and whatever. Certain things just have a momentum that carried forward. When the name changed to Power Query, suddenly, I was like, "Oh, okay, this is awesome." But it was Data Explorer, it was just a really cool novelty. I didn't have a sense of its purpose, or I didn't feel like it was a serious tool yet. It's really kind of interesting the power of naming, isn't it?

Sid Jayadevan (00:10:08): Neither did we, to a large extent, we were trying to find that identity. And I think as we got deeper into it, the query first aspect became a lot clearer.

Rob Collie (00:10:20): I also think the Power Query versus Power Import, I think the right decision was made there. The repeatability, you know this, but I'm going to say it anyway, if I only get five minutes with an Excel person who's never been exposed to the power platform, I have five minutes and I have to drop their jaw, I'm going to show him Power Query. I'm not going to show them the data model, I can show them DAX. Now, I think that ultimately, you absolutely need to be using both. But Power Query is such an amazing life changer for the Excel crowd and they can immediately appreciate what it's going to do for them.

Rob Collie (00:10:57): It's harder for them to appreciate what the data model is going to do for them. Which is why, if I've got five minutes, I don't go subtle. Isn't it amazing? You're talking about figuring out your own mission over time, that was something amazing. It sounds like from what you're saying that the M language, and the engine that goes with it, and all the stuff that's really difficult to build and to design, a lot of that was already kind of done before the Excel crowd became a focus. Is that true?

Sid Jayadevan (00:11:28): Yeah, that's true. Power Query is essentially a visual interface on top of the M language. The M language is absolutely the essence of the product, it's the foundation. And that foundation was built before the product as was often necessary, you need the foundation in place. And there is a long history around M that predates 2012, by let's just say, several years. And we won't delve into all the details, but much like we had to get clear about what Power Query was for and who we were targeting, there was a similar process with M. M was from a technology point of view, this very simple, yet powerful thing, in that it was functional, it composed, in our opinion, reasonably well. And so you could do lots of different things with it. But we wanted to put it in the hands of lots and lots of people who didn't necessarily have not even a programming background, but necessarily a query background.

Sid Jayadevan (00:12:29): And so that was the goal that we set for ourselves to bring all of those people on board, to make things possible for them that perhaps were a little harder in the past. And so M was really the foundation, and it was well in place CIRCA 2012, we made some changes to make it more friendly to the visual experience, to make it just a little more designer tool friendly, if you will. But the core of the language was already in place. And on the language front, we had tried lots of different things. And many, many people at Microsoft were involved in that effort at many stages.

Rob Collie (00:13:08): Something you said there really struck me and I wouldn't have thought about it this way until I heard the history. You made some changes, just some almost cosmetic changes to the language to make it more friendly to the visual composer aspect of Power Query. As soon as you said that, I'm like, "Oh, my god, yeah, this tool is super, super, super friendly to being edited and written from a graphical tool." A lot of times, you can go back into the M code and hand edit it and the visual editor is still completely okay. It totally understands what you did. That sort of round trip of hand editing and visual composer, exposing both to the user and still having a language and a tool that survives that duality, that's a challenge. That's a really big challenge. I've tried it multiple times at Microsoft. And I think I'm old for a lifetime on having designed a system that worked like that. So I can certainly appreciate it from a software engineering perspective, even just that one little detail of a language that was already pre built. That's kind of amazing.

Sid Jayadevan (00:14:08): We're still working on it, it's very much a work in progress. And there are aspects that are a little more, what should I say, language oriented, that remain an M that remain extremely powerful that the visual interface doesn't leverage quite as much.

Rob Collie (00:14:25): I've been talking to Miguel about this. And it's not just a deep power, but I think the Power Query transform, there should be many more tabs, because the language is so flexible. And I know that not everything can be turned into visual, some things are just absolutely going to forever remain 100% in the realm of I have to hand edit the M. I'm really nothing but a fan here.

Rob Collie (00:14:45): I told Miguel on a previous call that back in the early 2000s, when I was on the Excel team, and I caught the XML bug. The first feature set that I was a lead program manager for was the XML import and export capabilities. Not the XML file format, but data payloads, invoices or whatever, be able to move those in and out of Excel. And I spent two, four years of my life chasing a dream that I call Data Merge. There are huge, elaborate graphical mock ups of all of this, it was a really ambitious. It's exactly the kind of project you'd expect a young software engineer to get all amped up about and geeked out about.

Rob Collie (00:15:27): The thing I thought we needed to do was build some sort of repeatable data transformation logic into Excel. And I tried so hard to get budget, to get approval, to get greenlit to build a team to do this. And I got shot down four times a year for multiple years, every quarter, I make another run at it, like, "Come on, let's do this." Now that I've seen what you built, I am so glad that I never got approval to dive into that. Because once I've seen what it looks, a complete solution to it, I realized, oh my gosh, we were so overmatched. We would have never, ever succeeded, never come close. The fact that you had to go build a language first, makes sense to me now in hindsight, but it's really chilling. Oh my gosh, imagine they let me do my passion project. Thank you, Richard McAniff for never believing in me.

Miguel Llopis (00:16:24): Well, I'm sure you-

Sid Jayadevan (00:16:26): Yeah, not so sure about that, Rob. You might have done a lot better.

Rob Collie (00:16:30): I doubt it.

Sid Jayadevan (00:16:31): [crosstalk 00:16:31] Well, we'll never know.

Rob Collie (00:16:33): Well, I do, and I wouldn't have it.

Miguel Llopis (00:16:36): Well, Rob, I think you're being too hard on yourself. We don't have answers to everything. I assume you would have just like us, just fail fast, learn fast, iterate, learn from customers, learn from you, Sid, and just refine and get better over time. Here we are 10 years later, or eight years later, and we still have a lot of things to improve on.

Rob Collie (00:16:56): It's not even really just about me, it's also that Office doesn't have the right culture, to do something like what you've done. We would have gone and tried to solve a handful of simple cases, that's what would've happened under scheduled pressure. And we would have gotten committed to a system that wasn't elegant at its core. And then we never would have been able to really scale it to address... Because you know how it is, if you address 99% of problems that people have, it doesn't matter, that 1% is still going to plague enough of their workflows, that is the difference between they can adopt your tool or not. You've really got to be complete. We would have never been complete enough. And I can say that with confidence, knowing myself at the time, and also knowing the culture that was around me. We would have never gone and done the right thing, we would have hacked it, and we would have paid the price. And it's not just a question of me not being up to the challenge, just organizationally, we weren't at the right place.

Rob Collie (00:17:48): Office has really leaned into Power Query, it's a core part of the Excel ribbon now, basically taken over the prime real estate. So I think they're absolutely in on Power Query, and they're absolutely in on the value that it brings. It's just that they're not the right place to have invented it. In the same way that Office wasn't the right place to invent DAX, it's just not what Office does, Office does other things.

Rob Collie (00:18:12): If I wanted to turn this around and say, the historical struggles of the data side of the house has been that there haven't been traditionally as good at user experiences as Office was. But that gap is really closing, that has become an engineering discipline on your side of the house that it really wasn't when I was there. I used to describe Microsoft as there were user teams and engine teams, and there was no such thing as a team that was both. Office was the user team and the data platform, they built engines. But the engine team couldn't build user experience and the user experience team couldn't build engines. And so I think that's changed a lot. And this is a great example of it.

Sid Jayadevan (00:18:48): I mean to the point about Office and Excel, one of the things that has been a little different with Power Query is that we've embraced the open source model, perhaps a little bit more than for other products like that. We have the Office team contributing very heavily in our code base, not all aspects. It's the Power Query team that drives the majority of changes, but the Excel team is very, very involved. In fact, if you look at a lot of the developments around Excel on the Mac, the Office team has contributed very heavily to that. And so that ability to have other teams come in and make changes and they've really been a poster child for this on the Excel side, that has helped build Power Query into more of an ecosystem even within Microsoft.

Rob Collie (00:19:38): I didn't know that actually, I really had no idea that there were Office engineers contributing code. I just sort of naively I guess, assumed it was a one way street. You guys were sending them a build update every now and then and they were ingesting it.

Sid Jayadevan (00:19:51): And that is fundamentally how it operates because we do want everyone within the larger Microsoft ecosystem to be benefiting from the same enhancements, so there is a build that goes out every month for Excel desktop. But we also have a lot of teams across Microsoft who are contributing in a fairly big way.

Rob Collie (00:20:11): In terms of, we talked about the M language and all of that. I just told you the story about never getting the chance to do Data Merge on the Office team. I'm really deeply curious about how the M language got greenlit, how did the need for it get recognized and bubbled up into something that got resources. Because like I just said, it's such a crazy thing. If you haven't experienced the pain of the world, in terms of automatically munging and transforming data, if you haven't experienced it, and most people haven't, even at Microsoft, most people haven't experienced that, trying to convey that pain to other people is very, very difficult. I look at Power Query as, look, this is something that the world needed, not just like a demographic, this is something that improved the world. And yet, I know from experience, it's very, very difficult to explain to people who aren't already on board what the value is. Is there anything that we can talk about there?

Sid Jayadevan (00:21:06): Without getting into all of the details, we went through a number of iterations to get to where we are and where it started was with some precursors to M, which were more about modeling, what we set out to do. And there's a large number of people who contributed to this. And so some of this predates some of our contributions. I've been involved with the project on and off, in fact, I left at some point and came back to it. And a lot of the seeds of the project were in modeling related efforts. So ways of modeling your data, modeling your relational data model. And as I guess, in hindsight, could have been expected, folks started to realize that a lot of what you needed to do to have a successful modeling environment was enable transformations as a first class thing. And so at some point, you had a language that was a little bit of data modeling, and a little bit of transformations layered on top of that.

Sid Jayadevan (00:22:12): And frankly, over time, we talked about how the query thing became more and more important and became the essence of the product. The data modeling side of things faded to some extent, and the focus shifted towards transformation. And it shifted towards transformation of all data. There was a period, not just at Microsoft, but in the industry, where we were very focused on data as not necessarily a silo but homogeneous data stores. And when that heterogeneity of data became a reality that no one was going to change, the focus of tools like ours and languages like M shifted more towards that ability to embrace all kinds of data, wherever it might live. That, of course, changed the language and gave us what we have today.

Rob Collie (00:23:04): This will show how old I am, there used to be a series of commercials for Reese's Peanut Butter Cups, where two people would be walking along, one of them will be carrying a chocolate bar and one would be carrying an open jar of peanut butter for some inexplicable reason. And they'd bump into each other and the two would accidentally mix. And then they'd accuse each other, "You got your chocolate in my peanut butter." "No, no, you got your peanut butter on my chocolate." And then they would take a bite of it, go, "Oh, my God, this is the best thing." It kind of has that feel to it, doesn't it? The origins of M and Power Query, it's not like there was this anticipated union with some DAX and what we think of as the VertiPaq, Power Query data model. That wasn't a mission statement from the beginning, it's just these two things ended up going together super, super, super well, sort of an accidental union. That's been my sense of it forever. Is that true?

Sid Jayadevan (00:23:53): I think that's a very fair assessment. Miguel, what do you think?

Miguel Llopis (00:23:59): Yeah, I tend to agree. I was actually thinking about the previous comment you made about the heterogeneous nature of the data space right now. So yeah, really, when you talk about big data, it's not really only about the volume of data, there's also the variety of data, both in terms of the sources you connect to, the schemas they have, the different keys on either side, and the need to use things like fuzzy matching and mapping tables and whatnot.

Miguel Llopis (00:24:22): And then lastly, it's also about the velocity of the data, there's some data that changes once a day, there's some data that changes once a quarter, there's some data that changes multiple times per second. And so providing tools for non-technical users, which is the vast majority of people in the world, to actually be able to do this efficiently and with ease, and that even for somebody who can do the hard thing, of course, who wants to do the hard thing if you can do it in much simpler ways? I think that was key to us, just democratizing this whole problem space, and of course, there's a lot more that we can do.

Miguel Llopis (00:24:55): And thanks, Rob, for your list of suggestions from the team. We love those, and within our team, we do have this whole bucket of what we call customer love, which is about, "Hey, give us a problem that you've ever tried to solve with Power Query and it didn't actually make the cut for you. And I will try and generalize that and give you a feature out of it." That's how many of our existing transforms came about.

Rob Collie (00:25:17): It's just such a rich canvas. When you start from a language, you have a lot of future flexibility in what you can do, it's awesome. The heterogeneity thing again, I also really reacted to that, that speaks to me. Another thing that shows my age, I grew up during the peak Cold War between the US and Russia, or NATO and the Warsaw Pact, whatever. And so I read a lot of Tom Clancy and I was one of those kids.

Rob Collie (00:25:44): Something that really strikes me from that is that the two different philosophies of the 1970s, 1980s, Russian military strategy versus the United States is, you see it in every morning or every day of operations at a United States Air Force Base. Everybody at the Air Force Base, gets out in a big long line, and walks the entire length of the runway, picking up pebbles, and all kinds of foreign objects from the runway, because if any of that gets sucked into the intakes of these really sensitive airplanes there's going to be hell to pay, it's going to break it, it's going to go down. Whereas, the Russians built everything that they had, at least in theory to eat mud.

Rob Collie (00:26:23): I think the old world of BI was that 1980s American strategy. You had to have this absolute clean room. It's ideal, frictionless circumstances in order for everything to work right, which is, of course, it's completely unrealistic. The real world is dirty, it is noisy. There's chickens running across the runway, it's not just pebbles. Is there even a runway? And this wave of Microsoft tools, the Power BI beating heart, which Power Query is part of it, I mean, it is built for that real world messy, dirty reality. It's not the kind of thing that you imagine when you're sitting around in a whiteboard doing computer science. And when computer science can meet that kind of reality and perform, it's really something to behold. It's just a whole new era, isn't it?

Sid Jayadevan (00:27:16): Yeah, I couldn't agree more. It's messy and dealing with that messiness is still very much a work in progress. But that's the thing we're trying to embrace, that messiness that isn't going away anytime soon.

Rob Collie (00:27:29): It only gets messier, even our company.

Miguel Llopis (00:27:31): But at the same time tools get better and smarter. So how can we actually make it so that it's even easier and easier for you to do these things with Power Query in this case?

Rob Collie (00:27:41): Yeah. Some of the things that you can start to do with machine learning and AI to write the code for them, there's some scary stuff that can be done there. You know, column by example is sort of the most straightforward poster child for that kind of thing. I do want to at least make one joke with you, which is actually the truth but it's funny, is that back when I would teach classes, we still teach a lot of classes, but they don't let me teach them anymore, because I'm not as good as the people on our team. But whenever I'd bring up M, I would show people the code, and so we'd be using Power Query a little bit, and then I'd show them the code that it was generating. And then I would zoom in on that code. And I'd say, "And this word here at the beginning tells you everything you need to know about where this thing came from." The first word of every Power Query script being the word let. I just talked about the messy real world reality.

Rob Collie (00:28:31): But the word let at the beginning of every Power Query script tells you this came from the ivory tower. I look at the class and say, "It's almost like a philosopher smoking a pipe, who then says to you, 'Suppose.'" What if we posit, and then the script starts? Like I said, I admire what M can do. But the M language itself doesn't speak to me in its raw form. I look at it and I kind of want nothing to do personally with editing it. A lot of people, especially on our team, they do. I'm just one of those people that's like, for whatever reason, I was willing and able to learn DAX, and I typically don't learn stuff, I don't learn new tool sets, I don't learn new languages. The fact that I learned DAX is really an outlier for me. I'll never learn M, not in its raw form. I'm a button pusher, dyed in the wool.

Sid Jayadevan (00:29:25): And we want to cater to all the constituencies. The folks on your team who want to edit the M, we want to make that possible. And for the many folks who would rather press the buttons, for that we have the visual interface.

Rob Collie (00:29:41): Do you have those personas behind the scenes where you talk about the person who only wants to push buttons, you have the unsophisticated user of M persona? Can we just name it Rob and I'll give you a picture of me going...

Miguel Llopis (00:29:53): Actually, I call dibs on that one because I'm that kind of person as well. And that's what I would push for most of the time.

Rob Collie (00:30:00): Damn it. And you being on the team, you've got an inside track to be the persona. All right. Well, listen, I'm waiting in the wings. I'll be your understudy. So how much exposure have the two of you gotten to the next part of the chain, which is, do you sit around building Power BI models? Do you write DAX? Do you build data models?

Miguel Llopis (00:30:20): Yeah, big time. I mean, we use our tools, the tools that we build, we use them internally for, for example, understanding how users are using our products or understanding our backgrounds and our feature tracking report, you name it. Not to talk about personal projects, I do have my personal projects with Power Query and Power BI as well for non-work-related stuff. And that's actually, in my experience, what has helped me the most to understand and internalize all of the end user pain points around this area and to push the tool to become better. And I know Sid does quite a bit of this as well.

Sid Jayadevan (00:30:57): Yeah. The entire team does a large amount of eating our own dog food, dogfooding, and you've heard the Microsoft term for this. That's always been a very large part of what we've done. It's not just about using Power Query, it's about using it in the context of all of the things that Power Query is hosted within. And so Power BI, of course, and Excel and Power Apps, and aspects of Azure, we try to ensure that we're experiencing the end to end experience as much as possible.

Rob Collie (00:31:29): It's just a complete divergence from the path we've been on. But I want to at least mention to you before I forget that, in the past seven days, last two work weeks, off and on I've been teaching a little bit of Power Query to a high school football coach. We're just kind of messing around for the moment with Power BI through a pro bono project. It's just sort of a passion project of mine. I've got to tell you, it's fun. This guy's eating it up. He's loving it. I'm showing him how to add error checking and things like that for when there's the temp Excel file still in the folder that he's trying to load from, that's going to mess things up. Well, you could filter that out and everything. And yeah, he's sponging it up. It's just cool to see it. It's all these unexpected places you see these tools end up being used.
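
For readers curious what "filter that out" can look like, here is a minimal sketch in Python rather than M. It assumes the temp file is one of the owner lock files Excel creates while a workbook is open, whose names begin with `~$`; the function name `loadable_workbooks` is hypothetical and not from the episode.

```python
from pathlib import Path


def loadable_workbooks(folder):
    """Return the .xlsx files in `folder`, skipping Excel's temporary
    lock files (names starting with "~$"), which are not real workbooks."""
    return sorted(
        path
        for path in Path(folder).glob("*.xlsx")
        if not path.name.startswith("~$")
    )
```

In Power Query itself the same idea would typically be a row filter on the file name column of a folder query, applied before the step that combines the files.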

Rob Collie (00:32:14): So both of you seem to have a lot of opportunity to sort of drive the race car that you build. And that was not something that I really felt like I had much chance to do when I was at Microsoft. It's like, we built race cars, we have no idea what it feels like to sit behind the wheel. And so it's always surprising to people that with whatever tool I've been working on, the customers were better at using it than I was. It's nice that there's a little bit more of a culture now of using the tools even for personal use. Personal use is fantastic, there's nothing better than personal use.

Sid Jayadevan (00:32:46): Absolutely. As Miguel mentioned earlier, it's for both hobbyist projects, pet personal projects, as well as internal day to day work. Love using it for all of those things. And Miguel in particular has, I think, some soccer things that he probably uses it for, but I'll let him speak to that.

Miguel Llopis (00:33:10): Yeah, definitely soccer as well as a bunch of other things I wouldn't name. Yes, quite a few personal projects.

Rob Collie (00:33:17): It's really nice of you to call it soccer for us. I'm sure you don't call it soccer with your fellow soccer fans.

Miguel Llopis (00:33:24): Yeah. You mean with our football fans?

Rob Collie (00:33:24): Yeah. What are some of the craziest things you've seen? I'm sure that you've got just some really crazy stories of things that you've seen customers doing with Power Query that you never would have expected? Anything like that come to mind?

Miguel Llopis (00:33:37): Many things. So I guess could take crazy in a couple of dimensions. One could be unrealistic expectations on the tool or the technology. The other one could be tremendously complex projects. So I'll actually head down the second path.

Rob Collie (00:33:53): Sure. Let's do that.

Miguel Llopis (00:33:54): I think the biggest Excel workbook with PQ queries I've ever seen had probably about 280, 290 queries in it. I'm glad we introduced query groups as a feature because that person would be lost in the world without them. But even there, it's a pretty heavy project to maintain.

Rob Collie (00:34:13): And the dependency.

Miguel Llopis (00:34:14): Yeah, I was going to say, understanding query dependencies. So you do have some support for that in Excel today with query dependencies. We're working on way more interactive, highly visual experiences that eventually will make their way into Excel. But as of now, they're available in the Power Query online experiences with what we call the Diagram View, which is currently in public preview.

Sid Jayadevan (00:34:35): 290 queries.

Miguel Llopis (00:34:36): Yep. And they're all legit. We literally sat together and said, "Let's simplify this." And actually, yeah, he could have combined a few things, but it actually made sense the way he had it organized.

Rob Collie (00:34:47): And is the endpoint of that data landing in Excel?

Miguel Llopis (00:34:52): Yes, it was inside an Excel workbook.

Rob Collie (00:34:53): Wow. Wow. You don't have any examples of people using Power Query or data flows to automate their home? For example, I have a friend of mine right now who is setting up, using Power Automate, he's setting up where if he gets a text notification from a certain Internet of Things system, it will go in and adjust the temperature gauge, the thermostat, turn on heaters, turn on humidifiers, things like that. It's a terrarium, he needs to maintain the balance in this biosphere that he's built. And he's got monitors in there, but all they'll send him are text messages. That's all he can get. But he's like, "No problem, I'll eat those text messages and feed them into the Power Platform. And next thing, we're adjusting temperatures and humidity and all that kind of stuff." I bet there's a lot of stuff out there like that, it's data transformation but analysis isn't the endpoint. It's being used for something else.

Sid Jayadevan (00:35:54): We're blown away by a lot of the creativity. We've seen a lot of these very self-regenerative programs that people have created, where the queries adapt and do all kinds of things. It's a ton of creativity.

Rob Collie (00:36:11): One scenario, and now we're doing the program manager feature design thing. And one scenario that I've wondered about for a while is failures in a Power Query, the error handling. Using the moment of error, harnessing that, and activating a human workflow to address it. The way you're nodding, this is not the first time this idea has come up, right?

Miguel Llopis (00:36:38): Yeah, I was wondering if I had mentioned some of that stuff to you. Because today, within the Power Query Editor experiences, you do get some help with data profiling features, you understand duplicate values, you understand errors. To some degree at least within the data in the preview that was for you to run that over the entire data set. But nothing really helps you with, after you save that, and you say, "Yeah, refresh this thing every day at 8:00 AM." With understanding if that still is correct, if you get a new outlier value, if you get a new duplicate value, and you get some errors around that. That's one of the areas that we're looking at. And it goes back to the thing we were talking about earlier about, how can we further simplify this tool and make it more productive for the real users of it on a day to day basis. And this is clearly one of those areas. I mean, if you're putting together a report or a dashboard for your boss, you want to make sure that they don't start looking at the wrong data without you even knowing.

Rob Collie (00:37:32): Oftentimes, it manifests in some very sinister ways. Like if a data source succeeds in refresh, but it feeds you back nothing but zeros. [crosstalk 00:37:42] There's no runtime error. And then, of course, if you saw a report with nothing but zeros on it, you'd notice, you say, "Oh, clearly, this thing's dead." But if those zeros are only one leg of a five leg platform that makes a single metric, the answers you get on your report can still be credible.

Miguel Llopis (00:38:00): Yes, that is a problem.

Rob Collie (00:38:03): And I'm speaking from experience, I've been burned by exactly this sort of thing in the past. Even when there's a runtime error, it's almost always a human being that has to go do something. If a duplicate key comes in, that wasn't there before, what do I do about that? I have to-

Miguel Llopis (00:38:19): Would it be nice if we just fix it for you? Or if we maybe ask you, "We saw an issue and this is what we think you might want to do." And we give you a couple of options. And maybe you don't even have to go to the tool, maybe there's a quick text message you get, maybe somebody is giving you a phone call while you're driving, maybe it's an email that comes in and just with a couple of clicks, you can just get it fixed.

Rob Collie (00:38:41): These are all good ideas. I like this. This sounds promising.

Sid Jayadevan (00:38:44): One thing that we recently added in this space was integration with Power Automate. So that's more on the data flow side. And it's early days for that, but we've already seen some very interesting solutions. One of the things you can now do is have your data flow include a bunch of these reports for issues that you mentioned, you could perhaps partition off the errors or have a bunch of litmus test queries that check the data quality. And if those queries start yielding results, you can fire a Power Automate flow that can engage whatever workflow makes the most sense for you. Whether it's sending an email, whether it's writing something out somewhere for someone to take action, going all the way to sending someone a text message. All of those things are possible. They're perhaps not as frictionless and out of the box as they could be, but we're making some of those things more possible.
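
The "litmus test queries" pattern Sid describes can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not how data flows or Power Automate are implemented: each check returns the offending rows, and a caller-supplied `notify` callback (hypothetical, standing in for whatever workflow gets triggered) fires only when a check yields results. All names here are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable

Row = dict


@dataclass
class LitmusCheck:
    name: str
    # Returns the rows that violate the check; an empty list means "healthy".
    find_offenders: Callable[[list], list]


def duplicate_keys(key):
    """Check factory: flag every row whose key value was already seen."""
    def check(rows):
        seen, dupes = set(), []
        for row in rows:
            if row[key] in seen:
                dupes.append(row)
            seen.add(row[key])
        return dupes
    return check


def non_numeric(column):
    """Check factory: flag rows whose value cannot be parsed as a number
    (for example "!0" typed with the shift key held down instead of "10")."""
    def check(rows):
        bad = []
        for row in rows:
            try:
                float(row[column])
            except (TypeError, ValueError):
                bad.append(row)
        return bad
    return check


def run_checks(rows, checks, notify):
    """Run every check; call notify(name, offenders) for each one that
    yields results, and return all failures keyed by check name."""
    failures = {}
    for litmus in checks:
        offenders = litmus.find_offenders(rows)
        if offenders:
            failures[litmus.name] = offenders
            notify(litmus.name, offenders)
    return failures
```

A caller would pass something like `notify=send_email` or `notify=send_text`; a check for "refresh succeeded but one source returned nothing but zeros" would fit the same shape.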

Rob Collie (00:39:42): I think that problem of merging the automation with human-like referees of the occasional error is probably as ambitious of a problem to address as Power Query was originally. I've got a lot of respect for that problem, placing myself in your shoes. Might not be quite that ambitious, but it's a large problem. It's a product level problem to solve as opposed to a feature. Every now and then, I get some data where someone keyed in an exclamation point instead of a one, because their shift key was down, and all hell breaks loose over that exclamation point.

Rob Collie (00:40:24): You got a hard job, the error tracking in your system, it's many levels deep. We all know the experience of you get the error, and the top 11 errors all say exactly the same thing. And you scroll through the list to get to the one at the bottom that tells you hopefully, what really happened before the downstream errors happened. It's hard to bubble up the right error to the right person at the right time when almost by definition, you don't know, you can't anticipate what this error is going to be, you have no idea what's going to come in. So I recognize this as sort of a frontier for you, but I do not mean to trivialize it at all. It's only an improvement. It's not like you need to do this otherwise, everything you've done is... No, you can stop today completely and Power Query is arguably complete, you just have so many places where you could-

Miguel Llopis (00:41:16): Go and tell that to Satya, we want to still keep our jobs. Got to find new challenges.

Rob Collie (00:41:20): Well, next time I talk to Satya, next time he calls me up for advice. Yeah, I think it would be a shame if you did stop. It's a compliment to what you've got, that if you stopped today, it's already well past amazing. I'd say to students and clients that there are two engines at Microsoft, two data engines in particular, that all of Microsoft's competitors wish they had instead. One is whatever you're going to call the DAX and data model engine, VertiPaq. Microsoft is not very good at naming, I don't know if you all know that. And then the other one is the M engine, the Power Query engine, which also by the way, goes nameless in all of your products. It's just Get Data or Import or whatever now, Get and Transform.

Miguel Llopis (00:42:03): It's the M engine and Power Query is the experience.

Rob Collie (00:42:06): These two engines, whatever you call them, they belong in the software Hall of Fame. I believe that. And this is a very vicious critic of software who's talking to you right now. I hate software. And these two things, they demand your respect. It's got to feel good to have been involved in something like that from such an early stage. It's got to be one of the most gratifying sorts of experiences for a software engineer because most of the time, it's not like that.

Miguel Llopis (00:42:33): This is such a tough interview, Rob.

Rob Collie (00:42:36): To make you guys feel all gushy about yourselves.

Miguel Llopis (00:42:43): Yeah. Don't know what to say [crosstalk 00:42:44].

Sid Jayadevan (00:42:43): That's very kind of you.

Rob Collie (00:42:43): Oh, come on, you've lived it. Right? You've probably also lived as software engineers, you probably lived the other kind of project too. There's all kinds of dead ends in software that you can chase them for years.

Sid Jayadevan (00:42:54): I think one thing that's been a big differentiator with this one is, so Miguel and I are here today, but there's a team that has stuck together over an extended period of time. And it's the most fun I've had in my time at Microsoft. I'm very, very fortunate to work with those folks. For a problem like this, there is a kind of continuity that becomes necessary to... You talk about the iteration and needing to keep going. And we have a lot of work ahead of us.

Sid Jayadevan (00:43:24): But the thing that has made this easy and fun, at least from my point of view, is the team has been phenomenal. You tend to have a lot of churn on teams, and you go through phases, and people come and go. But this has been one where there's a set of folks. And I'm not talking about a handful of folks, it's probably a few handfuls of folks who really pushed on this over many, many years. I think that's one thing that's been a little different vis-a-vis a lot of other projects, that there's been a set of folks who have stuck with it and have been incredibly passionate about it. And that's been a big part of Power Query.

Miguel Llopis (00:44:02): Completely agree.

Rob Collie (00:44:03): Some products really require that kind of continuity in order to continue being successful. Excel, by the way, is one of them. I think Excel, I don't really know what it's like today, but when I was there, there was pretty healthy turnover every release on the Excel team. And the developers, the engineers, the people actually writing the code, they had a bit more continuity, actually quite a bit more than the program managers. It was every two years, the school bus would drive up, all the program managers would get on, it would leave, a new school bus would arrive with younger program managers and would drop them off. I got off that bus one day and joined the Excel team. And the engineers on the team were just like, "Ah, the new youngsters, we've got to train these people now too."

Rob Collie (00:44:55): It was a year and a half of working on Excel before I stopped coming up with feature ideas, like wouldn't it be cool if Excel could do this. It was a year and a half before I stopped coming up with ideas like that, where they'd look at me and say, "Yeah, we already have that." Honestly, I think that culture, that continuity was enforced more by a handful on the Excel team when I was there, they were keepers of the flame, if you will. And there was like one on the program management team, half a dozen on the dev team.

Sid Jayadevan (00:45:27): And you have a lot of those projects where you'll have one or two keepers of the flame. And I think what's been unusual with Power Query, at least compared to other projects I've been on is that there have been many, many keepers of the flame. And of course, you want fresh ideas. So you want people to be coming in and bringing those ideas, and we've had a lot of that as well. And so there's been keepers of the flame, there have been challengers of the flame in a very good way. So we've had that mix. But there has been a lot of good cohesion.

Rob Collie (00:45:59): It sounds like a good title for a Kickstarter funded board game, Challengers of the Flame. I'll tell you what, well, you all get equal rights. We'll call it a common intellectual property, that name. I'm hereby seizing 1/3 ownership in Challengers of the Flame, LLC.

Sid Jayadevan (00:46:19): What was the board game at the end of Office Space, the Jump to Conclusions board game?

Rob Collie (00:46:28): I don't actually remember, I've seen that movie so many times. Now, I've got an excuse to go watch it again. Tell my wife, "Listen, this is important. This is for work."

Sid Jayadevan (00:46:38): That was our quandary, what do you name the thing?

Rob Collie (00:46:43): So how much commonality is there, I'm assuming a lot, between data flows and the version of the M engine that lives in Power BI?

Miguel Llopis (00:46:55): Basically, it's the same engine. So data flows, the way I like to talk about this is layers of the onion. So if you think about the M engine as the core of the onion, then the next wrapper around that is the Power Query experience that allows you to create queries that run in M. The outer layer on top of that is really the data flows, which really automate and orchestrate many different sets of Power Query projects that were defined with a Power Query experience to generate M that runs.

Miguel Llopis (00:47:23): So whereas you could have a data flow that maybe brings say, your customers data, your customers table. Or your customers entity, you may have another data flow that connects to that customers entity and then maybe does a bunch of additional Power Query and M query transformations to do your customers who are most likely to churn. And it's the orchestration of that whenever that customers table gets refreshed, cascade refresh everything else that depends on it. That is what data flows are.
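
The cascade refresh Miguel describes amounts to refreshing everything downstream of a changed entity in dependency order. Here is a minimal Python sketch of that idea, assuming the dependencies are known as a simple mapping; the entity names and function names are hypothetical, and this is not a description of how the data flows service actually schedules work.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def downstream_of(changed, depends_on):
    """Every entity that depends, directly or transitively, on `changed`.

    `depends_on` maps each entity to the set of entities it reads from.
    """
    affected = set()
    frontier = [changed]
    while frontier:
        current = frontier.pop()
        for entity, upstream in depends_on.items():
            if current in upstream and entity not in affected:
                affected.add(entity)
                frontier.append(entity)
    return affected


def cascade_refresh_order(changed, depends_on):
    """Order in which to refresh `changed` and everything that depends on
    it, so that each entity is refreshed only after its upstream entities."""
    affected = downstream_of(changed, depends_on) | {changed}
    # Keep only the edges between affected entities, then sort topologically.
    graph = {
        entity: {up for up in depends_on.get(entity, ()) if up in affected}
        for entity in affected
    }
    return list(TopologicalSorter(graph).static_order())
```

With `Customers` feeding a churn-risk entity that in turn feeds a report, refreshing `Customers` would yield the three in that order, while an unrelated `Products` entity is left alone.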

Rob Collie (00:47:55): That makes sense to me. One of the challenges that I know that Power Query faces is that at tremendous scale, when the data is just gigantic volumes, the elapsed time of a query can get up there. And it's just an optimization thing. It's almost like the ideal software problem to have as engineers. How much progress has been made over the years? I haven't really been paying much attention to it. I just remember from the very early days, people saying, "Okay, it's great, but we can't use it for the 500 million row data set, going through a Power Query, just takes too long." Have there been any strides made? Again, I'm really sympathetic to this, it's a really hard problem. Power Query has to process every single row, it can't do the things like the VertiPaq Engine does where it sort of groups rows into clusters and treats them as one band of rows, you don't get those really nice columnar in memory tricks when you're performing transformations. So you're kind of up against physics in a way.

Miguel Llopis (00:48:56): Yeah, great point. So there's actually two avenues we can take to answer that question. I'm going to talk about both, I'll just call them out. And then I'll answer the easy one and I'll let Sid answer the harder one. One is about increasing the scale of what you can process with Power Query. And of course, you need to do that. But on the other extreme, there's also making it clear to the end user clicking those buttons, as Rob usually does, that there is a problem, so that they can correct that problem before it actually becomes the root cause for things that are many, many steps further down the pipe.

Miguel Llopis (00:49:29): And so in this area of making things more clear to users, we're actually introducing quite a few new features. We just announced something called the step folding indicators. So it's a feature we recently launched inside Power Query online, inside data flows. As you connect to a data source, let's say SQL Server, and you connect to the customers table, and then you apply a filter to maybe exclude customers in the US, then your filter step will actually get a tick next to it to say, "This has actually been pushed down to SQL because SQL can run filters like this one." Now you go to a different operation that does not fold. As a new step, it will actually immediately tell you, "Hey, this thing is no longer going to run in SQL, we're running it locally, we're compensating here, this is what's going to happen. In the extreme, this might actually cause you issues, click here to learn more, learn some best practices for how you could do things differently." And there are many other features that we're working on there. This is the most basic way to tell the most basic end user, "Hey, there might be a problem." It's like the engine light on your car dashboard.
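
To make the folding indicator idea concrete, here is a deliberately simplified Python sketch. It assumes a fixed, made-up set of operations the source can run, and it assumes that once one step fails to fold, every later step also runs locally; real folding behavior depends on the connector and the specific step, so treat the names and the rule as illustrative only.

```python
# Assumption for illustration: operations the data source can run itself.
FOLDABLE_OPS = {"filter", "select_columns", "sort"}


def folding_indicators(steps):
    """Label each step as folded to the source or run locally.

    Simplifying assumption: after the first step that cannot be folded,
    all subsequent steps are evaluated locally as well.
    """
    indicators = []
    still_folding = True
    for op in steps:
        still_folding = still_folding and op in FOLDABLE_OPS
        label = "folds to source" if still_folding else "runs locally"
        indicators.append((op, label))
    return indicators
```

Under these assumptions, a query of `filter`, then a hypothetical non-foldable `add_index`, then `sort` would show a tick only on the first step, which is the kind of early warning Miguel is describing.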

Miguel Llopis (00:50:33): We're also looking at things like query plans, more detail, deeper information for slightly more advanced users who actually understand the underlying SQL, the underlying code behind it, to go reason about okay, where are actually things going south? How do I understand this better? So I answered the easy part of the question, which is how do we make it clear that there's a problem? Now Sid can talk about the exciting stuff we're doing on the scale.

Sid Jayadevan (00:50:56): And I'll have, I guess, an unsatisfying, cryptic answer to that, because it's probably our largest area of investment right now. But we don't have anything that we can really announce yet, but it's something we're working on. And that should come as no surprise, because we hear a lot of feedback in the space. And there's a lot going on at Microsoft and in the industry in this space around making compute more available, even if your data lives somewhere where there isn't compute. And so that's something that we will definitely be investing in and that we're actively working on.

Rob Collie (00:51:32): I actually find that to be a very satisfying answer. Because honestly, all I want to know is that A, people are working on it, and B, there's optimism, there's still improvements to be made. That's all I really need to know. I mean, there's a nerd part of me, it's like, "Okay, come on. How do we do it?" But even then probably, if I got too close to it, I probably go, "Oh, yeah, now we're on board." I'm really just interested in the fact that it's going to happen. Have either of you seen all of these, a meme, but it's a YouTube meme, a format that's Hitler losing his cool, screaming at his generals in the bunker, and the subtitles have been replaced with something completely different? You seen these?

Sid Jayadevan (00:52:15): Yep, seen those.

Miguel Llopis (00:52:16): I've not, I'm too young for that.

Rob Collie (00:52:18): Oh, really? YouTube didn't go and record Hitler in his bunker. I don't know if you know that YouTube is a relatively recent invention that probably has happened in your lifetime.

Miguel Llopis (00:52:30): But I just don't have time to watch it. There's just so many soccer games to go watch. Sorry, football games.

Rob Collie (00:52:36): Football games. I agree. So I made one of those a long time ago, making fun of Tableau. And in terms of the first three months of its existence, it's probably the video that's been watched the most of all the things I've ever done on YouTube. There's a part at the end where he mutters under his breath, he turns to look at his subordinates and says something like, "And if you think we're paying for those Alteryx licenses, you better be sprucing up your LinkedIn." So as the Power Query folks, I made that joke for you.

Miguel Llopis (00:53:09): [crosstalk 00:53:09] Geek.

Rob Collie (00:53:10): There's a lot of inside baseball in that video. Even the Tableau employees that have seen it look at me and say, "Okay, that actually was pretty funny." What's next? What have I not asked about? Such an exciting space with so many opportunities.

Sid Jayadevan (00:53:26): We have a whole new interface coming in terms of a more diagrammatic visual representation of the queries. Miguel may have alluded to this before. That's a big one, changes the profile of the product quite a bit. We're not taking anything away. And we don't want to make things tricky for people who are familiar with the existing interface, so it's strictly additive. But that's one we're really excited about. We've tried a few things there. It is a new interface, but we're also using it as a way to address some of the feedback that people had, just around how you track relationships and make it a little more fluid to chain things together. So that's one that I think the team's very excited about. So we're going to push that one out pretty soon. And that's already in preview, so you can go play with it.

Rob Collie (00:54:19): I think I should, as one of the absolute sloppiest designers of Power Query scripts in the world. If you ever want examples of really, really, really ugly, "I can't believe this Rube Goldberg sequence that someone's written," all you need to do is just ask me for anything that I've done. I've got stuff now that I'm just like, "Okay." I've got four queries that are basically one to one linearly feeding into each other, that their only purpose is to feed the next one. And they're not even sorted in the proper order in the query pane. Even I don't remember which one is the root, I don't remember which one is the first one in the assembly line. Every time I go back and look at it, I have to sort of re-trace, trace, trace, trace. Like, "Okay, that's right. That's how this thing works." You want ugly, I got you covered.

Miguel Llopis (00:55:06): Yeah, we would love to see those and see how what we're investing in actually behaves against that. So yeah, I just sent you a link in the chat window for the Diagram View, would love your feedback on that. Lets you share feedback about every other area. And again, we will take it and we'll generalize it, and we'll make it into something that improves the product.

Rob Collie (00:55:26): And this poor high school football coach, his first exposure to Power Query is with exactly the example I just told you about. He has no idea how much better it can be.

Sid Jayadevan (00:55:37): That's cool. You know that ad hoc style of using the product where you don't necessarily architect how your queries come together, in some ways we want to cater to that even more. We don't want to go in the direction of some very formal modeling exercise. And we want to keep enabling that style of using the product. And so this tool is not meant to police any of that, it's more just to help you understand it better. I have some mashups where I have so many queries, and they could have been designed way, way better. And over the passage of time, a few months later, I look at the thing, and I have no clue what I did, back when I created it. So this is meant to help with those sorts of things. It's more to help you decipher what others did, and sometimes help you decipher what you did.

Rob Collie (00:56:29): A previous version of you is almost just as inscrutable as another person's work. I can be away from it for two days and come back and go, "What was I doing here?" Same is true, by the way, with spreadsheets, traditional spreadsheets, non DAX spreadsheets. I can generally go back to one of my DAX models, and pretty quickly get back into the personality of what I was doing there. But the old spreadsheets, using just the Excel formula language, and really pushing it to its limit, oh my gosh, those things. I'm always impressed at how smart I must have been in the past to have put one of those together. The current version of me always feels dumber than whatever the version was, that was able to do what I did in Excel back in the day. So, connectors?

Miguel Llopis (00:57:14): There's a roadmap on connectors, there's some... Overall, our strategy with connectors is one where we have the custom connectors as the key that empowers anyone to build connectors. This could be you building your own connector for whatever you're trying to do. Or this could be an actual ISV company who owns an underlying data source backend, who actually wants to provide connectivity to that from Power BI, from Excel. And we do have certification programs around that.

Miguel Llopis (00:57:42): So really, there's some new connectors coming out of our team, there isn't much in terms of net new connectors. There's a bunch of connectors that our team owns from the early days when we didn't have this way to actually extend our SDK. And there is where you see most of our investments on making sure that X connector now can use this certain new feature that the underlying back end added or that customers are demanding now, more than others.

Miguel Llopis (00:58:09): So I wouldn't say there's much in terms of excitement there on connectors to cover at this level; it's very point-wise, feature-level things on existing stuff. Then there are the experiences, the Power Query experiences. So yeah, everything you want to talk about regarding diagram views, or more by-example-like experiences that infuse AI into the product. On the dataflows front, we talked a little bit about the refresh-based data quality and monitoring stuff, which is actually not formally in our public roadmap. But I think the discussion we had just screams that, hey, this is actually a useful area that I think is just okay. Those are kind of the big pillars.

Rob Collie (00:58:47): There was something interesting as you were talking about the connectors. Of course, Microsoft cannot write connectors for everything, the list of everything is damn near infinite. And a reasonable percentage of the time, the system that I wish there was a connector for is a non-Microsoft product that at best is sort of neutral towards integration with Microsoft technology, and other times it's openly hostile to it. And so, at our company, of course, we're Microsoft at its core, we use a lot more Microsoft than other stuff.

Miguel Llopis (00:59:21): Come on, you don't need to apologize, what else are you using?

Rob Collie (00:59:24): We use a lot of things. And so like Salesforce is our CRM, and in terms of workflow, it's one of our most central systems. Certainly not our only system. Right off the bat, we've got an alien right in the middle of the story. And we park a lot of data for our own internal BI. And by the way, our internal BI is very, very, very sophisticated today, it's not a stretch to say that we simply could not survive without it. It's not like our business operates and then we use BI to optimize it, it is life support. It's the oxygen supply, it is really, really, really, really critical to us and our business model.

Rob Collie (01:00:07): So we use another third-party product called Stitch, which you've probably heard of. They've written a bunch of connectors, essentially, that will then dump data into various endpoints that they know about. And so we get a lot of data out of our core systems into the Azure Data Warehouse, so not just Azure SQL, via Stitch, and then Power Query kicks in. It's not like it just lands there. Gosh, our Google AdWords data, we're grabbing that from Stitch into Azure Data Warehouse.

Rob Collie (01:00:40): And then because the data that's grabbed... It's so weird, guys. AdWords data is like day-to-date running totals. So every time you take a snapshot of it, it's like you had three clicks last hour, now you have seven clicks. Does that mean you have 10 clicks today? No, you have seven. So we've got Power Query that is doing a group by and taking the max, or grouping by the most recent timestamp on that day, because Stitch doesn't do anything magical for us, all it does is just a raw data dump from one place to another.
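Editor's sketch: the snapshot problem Rob describes can be shown in a few lines of Python. The real logic lives in Power Query and is not shown in the episode, so the row layout, campaign name, dates, and the `daily_totals` helper below are all hypothetical. The sketch keeps, for each campaign and day, the value from the most recent snapshot rather than summing the snapshots.

```python
from datetime import datetime

# Hypothetical snapshot rows: (campaign, snapshot time, clicks so far that day).
snapshots = [
    ("brand", datetime(2021, 3, 1, 9), 3),
    ("brand", datetime(2021, 3, 1, 10), 7),
    ("brand", datetime(2021, 3, 2, 9), 2),
    ("brand", datetime(2021, 3, 2, 15), 5),
]


def daily_totals(rows):
    """Return {(campaign, day): clicks} using the latest snapshot of each day."""
    latest = {}
    for campaign, taken_at, clicks in rows:
        key = (campaign, taken_at.date())
        # Keep only the most recent snapshot seen so far for this key.
        if key not in latest or taken_at > latest[key][0]:
            latest[key] = (taken_at, clicks)
    return {key: clicks for key, (_, clicks) in latest.items()}


totals = daily_totals(snapshots)
for (campaign, day), clicks in sorted(totals.items()):
    print(campaign, day, clicks)
```

Summing the raw rows for March 1 would give 10 clicks; taking the latest snapshot gives 7, which matches the example in the conversation. Taking the maximum per day gives the same answer as long as the running total never decreases within a day.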

Rob Collie (01:01:14): It's really neat like this, going back to that metaphor of, you want your jet plane built for the reality of the world, with all kinds of noise, and all kinds of variety, and all kinds of unpredictable things. And even without dedicated Power BI connectors, for a lot of our systems, it doesn't matter. We're going to get that data. And the fact that the Microsoft tools participate in this larger ecosystem.

Rob Collie (01:01:39): I've always been really ambitious about defining sort of the new template for what consulting firms should look like in this new world. It sounds like a cliche, but 11 years ago, I'm sitting in my office one day in Cleveland, and I was using Power Pivot and it just hit me like a thunderbolt. I'd suddenly done something that was not possible. And I'd done it in a space of like an hour that had taken weeks and weeks in the previous world. And I could kind of see that the world was going to change, that the size and duration of a typical project was going to shrink dramatically, still have the same amount of impact as the big long project, in fact, actually better. It's going to have more impact because the short projects means that you're actually holding people's attention long enough to iterate and get the real results that the longer projects never got to because people got too exhausted, and just called it done even when it wasn't. So the size of the average project was going to compress dramatically.

Rob Collie (01:02:37): So the utilization model for a traditional consulting firm, which has long been like park a handful of people on a six-month minimum project, that whole business model was going to die. Now I thought it was going to happen a lot faster than it has, it still hasn't happened, really. We've reached the point where the world has intellectually agreed that the citizen developer model is primary, and is important. But for a long time, that was still heresy. So we've reached the point, we've intellectually accepted that.

Rob Collie (01:03:06): But that doesn't mean that the real on the ground muscle memory has changed. This has been the mission for 11 years, go and build this firm. However, we never took any investment. It's not like really people who would ever want to fund a consulting startup anyway. Angel investors and venture capitalists, they're always looking for tremendous intellectual property. They don't want people involved. Consulting firm has too many people, it's too good of a deal for too many people. They want something where you can essentially charge rent when it's done. So we wouldn't have really been able to attract that kind of funding anyway. Plus, they would have ruined it if we had taken their money.

Rob Collie (01:03:40): So we've organically grown, all of our hires, and all of our growth has been funded out of revenue, which makes it slow or slower anyway. But someone told me something a long time ago, which is like, "Let me tell you about my 10 year overnight success. It's not overnight, but it has been 10 years." And I'll be completely honest with you, I think that there's really no limit to how large we can be. It's been a long road, but the way we operate is to run with these tools as fast and as impactfully as they allow. So we're great for the customer. We're great for the customer in a way that I don't think really any other Microsoft partner is. It's a very hard business model. It's obviously the thing that the customer needs. But it's a hard business model to sustain which by the way, we've used the Microsoft platform to make it work internally.

Miguel Llopis (01:04:32): So your ceiling, your bottleneck is actually going to be at the very least talent acquisition so that you can scale to more people as you scale to more customers.

Rob Collie (01:04:41): Yeah. I learned a lot of things at Microsoft about interviewing too. And we're using a lot of systems, we have a lot of actual both software and delegation to assistants and things like that, that allow us to scale. The hard lessons that I learned about interviewing at Microsoft, we apply that at national scale. So we have like a 2% offer rate for our candidates, and we get to pick the best of the best. So I actually don't think we have a supply bottleneck either.

Sid Jayadevan (01:05:13): That's a lot of interviewing.

Miguel Llopis (01:05:15): We're hiring. So if you have any pointers, we appreciate them too, both engineering as well as PM.

Rob Collie (01:05:21): Well, I don't know, that's the kind of consulting fee that I'm going to have to [crosstalk 01:05:27]. I'm going to have to have McKinsey white label me, so that I can charge Microsoft the millions of dollars that they would pay McKinsey, but they would never pay Rob Collie. Definitely, yeah.

Sid Jayadevan (01:05:42): Very interesting. Fascinating story. I mean, I've watched from afar, but I didn't know many of these details. And so yeah, very helpful.

Rob Collie (01:05:52): The engineering mindset, I think both of you actually would be really sincerely kind of interested and fascinated by all the things that we've developed and found about how to incentivize the right things with our consultants, for our clients. And there's something almost, it's not patentable, it's not protectable. But there is something in the same way that software has intellectual property: our system, our all-up system, software, people, workflow, all of that. I'm pretty sure this is the only instance in the world like it of a company that operates like this. We've had to discover how to do this; there was no template to follow. You guys both know how exciting that kind of problem is. The same sorts of things that get you geeked up about going to work at Microsoft to solve that performance problem or whatever we're talking about, that same itch being scratched, but in a different plane. No, you can't have any of my people, Miguel.

Miguel Llopis (01:06:48): Good to know.

Sid Jayadevan (01:06:50): And have you been geographically distributed throughout?

Rob Collie (01:06:53): Yeah. It was really like 2015 was the first time that I realized the demand for work I was bringing in was exceeding my personal capacity to address it. And it was just me running the website and doing the trainings and doing the consulting up until that point, basically. And I knew that I didn't have time to train up another consultant that could do the work that I was doing, I needed to find someone who was basically ready today. And I knew that I wasn't going to be able to do that, if I was just like, "Let's just find someone in the vicinity of where I currently live."

Rob Collie (01:07:30): So the very, very, very first candidates, the very, very first interviewing that we did as a company was remote. And the first few people to pass this interview, which again, was designed 100% from my experience interviewing program managers at Microsoft, and especially the fact that I'd done it the wrong way for years and then I did it sort of the right way for the last third of my career. The first people to pass it were all over the country. They were in Oregon, they were in Iowa, they were in Iowa, they were in Alabama, and I was in Ohio.

Rob Collie (01:08:05): So it's actually something that's really interesting. And I'm almost a little bit bummed about the fact that COVID has rewired everybody this way. Because for a while, I think we'll still have this advantage for a long time in a way, but especially given the nature of the consulting industry that we're in, which is still very in person. When you can hire from any geography, you can afford to be a lot more selective. You just have a bigger denominator. If you want to hold a really, really high quality bar and clear it, you can do that, if you're not geocentric. So in a way, we were kind of forced into behaving optimally from the beginning. It wasn't some fiendish genius plan, like, "Oh, we will be geo distributed, and we will therefore get the best talent, and bahaha." It wasn't like that at all. It's just like, "I need a person and there's no way that I'm going to find one in Cleveland." And all followed from there. So it's really insane. Heck of a journey.

Sid Jayadevan (01:09:09): It's an amazing success story. And sounds like you guys feel like you're just getting started.

Rob Collie (01:09:16): And trust me, plenty of failures along the way. I found out somewhere along the way that I actually am not good at running a business. And it's like you find out you're not good at driving a boat and the way you found out is, "I just crashed it onto a reef." I did that. I almost killed my own baby at one point. And I had to realize that I needed to share the steering wheel. And so the guy whose podcast went live this week, Kellan, he's the architect of almost all of these good things I've been talking about. My vision and the things that I wanted to have happen never, ever would have met reality without Kellan to bring them to life.

Rob Collie (01:09:54): And he was one of the first people to pass the interview. I hired him as a consultant originally. I had no idea that I was hiring my other half at the time. It took a long time for me to come to terms with that. So hard road, lots of humbling, really humbling experiences. Well, guys, I'm sincerely grateful to be able to grab a couple hours of your time. Thanks for doing it.

Miguel Llopis (01:10:16): Thanks.

Sid Jayadevan (01:10:16): Thanks for having us.

Announcer (01:10:18): Thanks for listening to the Raw Data by P3 Podcast. Find out what the experts at P3 can do for your business. Go to powerpivotpro.com. Interested in becoming a guest on the show, email lukep@powerpivotpro.com. Have a data day.
