From data to decisions: The future of OSINT


Speaker 1: Welcome to The World of Intelligence, a podcast for you to discover the latest analysis of global military and security trends within the open source defense intelligence community. Now onto the episode.

Kate Cox: Hello and welcome to Janes World of Intelligence Live, coming to you from DSEI 2025. I'm Kate Cox, your host for today and the director of strategic programs in Janes' analysis division. Today we'll be talking about the future of open source intelligence, a popular topic with listeners of the Janes podcast. And to guide us through this important topic, I have our panel next to me. Listeners of the podcast will be familiar with Sean Corbett, the chair of the National Security Advisory Board. Welcome, Sean.

Sean Corbett: Okay. Hello, everybody.

Kate Cox: We also have Phil Smith with us, the chief technology officer at Janes. Hello, Phil.

Phil Smith: Hello, everyone.

Kate Cox: And we have Leendert Van Bochoven, our chief commercial officer at Janes. Hello, Leendert.

Leendert Van Bochoven: Thank you, Kate.

Kate Cox: Great. So we're here to talk about OSINT, what it is, how decision makers and analysts engage with it, and the challenges and opportunities that it presents. If there's time at the end of the discussion, we'll also open this up to some audience participation. So please do have a think about any questions you might like to ask our panelists. But before that, there is a lot to discuss. So let's dive in. Sean, to set the scene, how would we define OSINT and how can it take us from raw data to informed decision-making?

Sean Corbett: As ever, Kate, that could take up the entire podcast just in terms of defining intelligence, because there isn't actually an accepted definition of it. Don't worry, I'm not going to go down a real rabbit hole, but for those who have heard the podcast before, for me there are four distinct elements to open source intelligence, two of which are also common to intelligence as a whole. The first is that it has to be applied to a problem set, a challenge or a difficult problem. It can't just be done willy-nilly, and a lot of the challenge there is actually setting the right question. Generally the problem set, in our case security or defense related, is to enable decision makers to make their best decisions. So that's the first element. The second, and it's really important to distinguish between intelligence and information, is that intelligence is incomplete information. My analogy, if you like, is a jigsaw puzzle. Hopefully everyone knows what a jigsaw puzzle is, but it's like taking the top of the box off and finding that a lot of the pieces are missing, some of the pieces are parts of another jigsaw puzzle, and some have even been torn up. And the idea is to get to a stage where you come up with as near to the picture as you possibly can. Now, the problem always used to be that there just weren't enough pieces. The problem we've got now is that there are probably thousands of pieces from millions of jigsaw puzzles in the same box. So that's what's changed. That's the second thing. The third thing, from an OSINT perspective, is that it has to be publicly available or commercially available, and there is so much good data out there that you can just get: either you can buy it or it's simply out there. The amount of data, which is just crazy right now, is both a threat and an opportunity, and not all of it's right; no doubt we'll come back and talk about that a little later. And then, for a company like Janes, and for most of the West, the way we collect that data has to be both legal and ethical. Now, that's not always the case with our adversaries, and we need to be conscious of that, but there are ways of collecting data that are legitimate and allow us to use it in a good way. We're talking about data privacy and that sort of thing. So in a nutshell, that's how I define it.

Kate Cox: Yeah. Do you think it's still a useful term?

Sean Corbett: Well, OSINT or... Yeah, I do. And the reason I think it's still useful is because I don't think the community is yet in the place where OSINT is treated as just normal intelligence. It's a really good question, because we need to start thinking beyond OSINT and beyond classified intelligence: it is incumbent on an analyst to consider all elements of the data, whether it's from classified sources or not. And the intelligence community has definitely been on a journey over the last 5 to 10 years. Five years ago, there were many in the community who wouldn't consider open source intelligence to be an intelligence discipline at all. We're beyond that now. In fact, we're getting to a place, and we might talk about this later, where we ask whether OSINT is even a relevant phrase, because it does compartmentalise it. Whereas if you look at the sources of OSINT, you can get commercial satellite imagery, you can get social media intelligence, as we call it, and all the other forms of intelligence that are subsets of what we have known in the big intelligence world, HUMINT and so on, et cetera. And we have to incorporate all of those into it. So it's still relevant in terms of how we're looking to develop it, but it should be becoming less relevant in terms of compartmentalising it; it's just part of intelligence.

Kate Cox: So we've talked a little bit about what OSINT is. It would also be good to talk about why it's useful. So Leendert, could you tell us a little bit about why OSINT matters and what benefits it offers analysts and decision-makers in the defense and security space?

Leendert Van Bochoven: Yeah, thanks, Kate. And let me build on what Sean just mentioned, that the definition of OSINT [inaudible] could be too limited, I think, because nowadays we're talking about information from open sources, whatever they are, being applied to different use cases across defense and intelligence organizations. So when we talk about the value that brings to the table, I would summarize it with three Ts at this point in time. First of all, time: there's a time value, a benefit, to what OSINT brings to the table. Things are moving so fast nowadays, there's so much out in the open, and the time value of searching that data, analyzing it and using that analysis is a major advantage and a big part of the value it brings to defense and intelligence organizations. By the way, not all OSINT is created equal. We're not just scraping the internet, and this is also the question: where does the analysis come into it before the data is used in decision-making processes, or not? So there's a time element, I think, which is really very, very important. And of course, what we as Janes bring to the table is context for that information as well, because over the past years we've developed a data model, ontologies and so on, that describe the data in its context. That gives analysts a time advantage in using that data in their decision-making process. I think it's indispensable nowadays in the contemporary intel environment to use that information. The number of open sources is growing every day, and on the other hand you've got the time issue of how fast you can derive insights from it. So one of the key values and benefits OSINT brings to the table is the time element. The second thing is trust. If done properly, it brings an element of trust, and let me talk from a Janes perspective. What we bring together is foundational data, current data and so on that we've consistently and accurately assessed. That gives an element of trust to the OSINT data you're using. And especially in the environment in which I operate, for example in NATO, there's a lot of importance given to the trust of the information. Can you trust that source? Can you use it in your processes? Is it then shareable across the coalition so that you can build on it? Do you trust where it came from? Do you trust the model by which it was created, and so on? Is it shareable across the coalition, but also across use cases? That, again, builds trust. So if you can use the same data, the same underlying OSINT, so to say, across your operational systems, your planning systems and your long-term capability systems, that again helps build trust. Maybe the third value I want to bring in here, besides time and trust, is that it has transformational value as well. If done properly, what we're trying to do is integrate data at the data level. And basically, what we're trying to build is an interconnected data set that can be used to further enhance and further integrate other sources as well. And that provides transformational value.
I think we're not just integrating things at the screen, so to say, or at an application level; we are fundamentally integrating data and bringing it together in a trusted environment, an environment where you can exploit it through integrated means. So I think time, trust, and transformational value, that's what our customers are looking for. That's what defense and intelligence organizations are trying to apply. And if on top of that you then say, "I want to apply modern technology to exploit that," like AI, for example, you do need information that looks like this; only then will you be able to exploit it at scale and in the context in which you want to exploit it. So if you want to use AI to exploit it, I think OSINT data sets are crucially important for that. So that's, I think, the value of what OSINT brings to the table if done properly.
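As a rough illustration of what "integrating at the data level" might look like, here is a minimal, hypothetical sketch: records from different sources reference shared identifiers from a common data model instead of duplicating each other, so the same underlying data can be reused across operational, planning, and capability systems. The entity names, IDs, and fields below are invented for illustration and do not describe Janes' actual data model.

```python
# Hypothetical illustration of data-level integration: reference records and an
# open-source sighting are joined through stable identifiers rather than being
# stitched together on screen. All names and fields are invented.

equipment = {
    "EQ-001": {"name": "Example multirole fighter", "role": "air superiority"},
}

bases = {
    "BASE-17": {"name": "Example air base", "country": "Atlantis"},
}

# A sighting links to both reference records by ID instead of duplicating
# their details, which is what keeps the data set interconnected.
sightings = [
    {"date": "2025-09-10", "equipment_id": "EQ-001", "base_id": "BASE-17", "source": "commercial imagery"},
]

for s in sightings:
    eq = equipment[s["equipment_id"]]
    base = bases[s["base_id"]]
    print(f"{s['date']}: {eq['name']} observed at {base['name']} ({s['source']})")
```

The point of the sketch is the design choice, not the code: because every system resolves the same identifiers, a new source can be layered in without re-integrating each application separately.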

Kate Cox: Yeah, absolutely. Well, there are a couple of points on technology that I definitely want to come back to with Phil, but turning to some of the challenges, let's think about the processing challenge in particular. Leendert, Sean, you've both touched on the vast amount of OSINT at our fingertips now, which is a great resource for researchers but also presents a huge processing challenge. And some talk about it now as the democratization of information, don't they? So Sean, how can we navigate some of these OSINT challenges?

Sean Corbett: So firstly, I want to talk about the democratization of data, because it's an easy phrase, but it's a really bad one: not all data is equal. And this does help to answer your question, actually. So how do you get through all those jigsaw pieces to get to the right one? Well, it comes down to tradecraft, effectively. It's being able to validate and assure the pieces of information, check your assumptions, and really work through, in an objective way, whether that data is helping me to answer the question. And nowadays you just cannot do it in the old way. I bet you they're out there: there are people within the intelligence community still using Excel spreadsheets to manage their data and then trying to come up with the goods. You simply can't do that now, so you're going to have to do it in another way, and I'm sure Phil will come on and talk about that in a minute. In the collection phase, you've got to be able to filter to an extent where you get all of the relevant information, but only the relevant information. That has always been a massive challenge for the intelligence community. There is simply too much information out there to do it in a manual way. And the second thing to say is the integration of classified and unclassified data, which we've already mentioned a little bit, but it is incumbent upon an analyst to make sure they use all the information. You hear the term all-source analysis. Nobody does all-source analysis, because they don't have access to all sources. At best it's multiple-source, and by working behind firewalls and all the rest of it, you are almost unconsciously biasing yourselves towards certain data, and that is a real challenge. So Phil, do you want to talk a little bit more about that side and what you guys are doing to get through it?

Phil Smith: Yeah, I think for me, the role for AI in the cycle is to be able to do the collection and the discovery, what you might call narrow AI: to be able to summarize information and drive it forward. So in the context of the cycle, what you're trying to do is make sure that the technology is teaming with the humans in that process and is pulling the data through. I think at Janes, and in a lot of other organizations, what you're seeing is people trying to process the volume, filter it down, discover the things that are relevant, and then push them to the analyst to do that contextual analysis.

Kate Cox: Do you see, Phil, technology and particularly AI as an enabler or accelerant for OSINT?

Phil Smith: Yeah, I think it's both. The potential for AI is obviously huge, but there are challenges about how you execute it, how you engineer it, how you manage expectations. People are used to using AI, ChatGPT, day to day as a personal productivity tool, but using it inside an OSINT cycle is quite different. So there's a piece, and I'll go into this a bit later on, about how you use it and how you leverage it in a really efficient way, because what you don't want to do is create noise. When we at Janes first started using AI to process data in our discovery pipeline, what we were generating was a lot of noise that the analysts then had to filter back out. So over time we've been tuning that to increase the efficacy of what we're doing, so that you're feeding things that are useful to the analyst rather than feeding noise. What you're trying to do is go from a high noise ratio to a very, very low noise ratio, not from high to medium. I think that's important, following up on your point, Sean.
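To make the filtering idea concrete, here is a minimal, hypothetical sketch of the kind of relevance gate Phil describes sitting in a discovery pipeline. The function names, scoring method, and threshold are illustrative only and are not Janes' actual implementation.

```python
# Hypothetical sketch: score each collected item against the analyst's question
# and only pass high-confidence items through. Names and thresholds are illustrative.

from dataclasses import dataclass

@dataclass
class Item:
    source: str
    text: str

def relevance_score(item: Item, keywords: list[str]) -> float:
    """Crude relevance proxy: share of the question's keywords present in the item."""
    text = item.text.lower()
    hits = sum(1 for kw in keywords if kw.lower() in text)
    return hits / len(keywords) if keywords else 0.0

def filter_for_analyst(items: list[Item], keywords: list[str], threshold: float = 0.6) -> list[Item]:
    """Keep only items above the threshold; raising it trades recall for a lower-noise feed."""
    return [i for i in items if relevance_score(i, keywords) >= threshold]

if __name__ == "__main__":
    collected = [
        Item("news", "Reports of new radar deployments near the northern border"),
        Item("blog", "Top ten recipes for the weekend"),
    ]
    question_keywords = ["radar", "deployments", "border"]
    for item in filter_for_analyst(collected, question_keywords):
        print(item.source, "->", item.text)
```

In practice the scoring step would more likely be an ML classifier or an LLM call rather than keyword matching, but the tuning Phil describes amounts to pushing the scorer's accuracy and the threshold until the feed is genuinely low-noise rather than merely medium-noise.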

Kate Cox: So you talked a little bit about human-machine teaming as part of OSINT. How can we do that most effectively? How does that work best, really?

Phil Smith: So for me, if you think about what AI is really good at, it's good at processing language and it's good at managing high volumes of data. So one of the key things at the moment is that we want to use AI inside the process. If you think about what's happening in the AI space, the models are evolving incredibly quickly. What's also happening is on the framework side: you need a framework to embed the AI model into a process, and the frameworks are very immature at the moment. So what we're seeing is the evolution of frameworks that allow people to embed the AI into an effective OSINT process in a more maintainable way. And one of the really important things about this is that because the models are evolving and the frameworks are evolving, I can guarantee that whatever you build today, you'll be rebuilding in six months' time, or a year's time, or two years' time. Well, actually, probably all of those. So if you don't want to build something that very rapidly becomes obsolete and unmaintainable, you have to engineer things to be modular. If you look at agentic AI, it obviously has great potential for OSINT, but you're going to have to build it in a modular way, because you'll be replacing lots of the bits every three months; otherwise other people are going to build something better. If I were to build a thing in six months' time, it would be better than the thing I would build today. So building things so that they're maintainable and sustainable, so that they continue to add value and push the boundary of what's doable, is going to be at the core of anything you do.
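One way to picture the modularity Phil is arguing for is a pipeline whose stages sit behind small interfaces, so that an individual model or framework can be swapped out without rebuilding the rest. This is a minimal, hypothetical sketch; the class and method names are invented, not a real Janes design.

```python
# Hypothetical sketch of a modular OSINT pipeline: each stage is defined by a
# small interface, so components can be replaced as models and frameworks evolve.

from typing import Protocol

class Collector(Protocol):
    def collect(self, query: str) -> list[str]: ...

class Summariser(Protocol):
    def summarise(self, documents: list[str]) -> str: ...

class KeywordCollector:
    """Stand-in collector; a production version might wrap a search or feed API."""
    def __init__(self, corpus: list[str]):
        self.corpus = corpus
    def collect(self, query: str) -> list[str]:
        return [doc for doc in self.corpus if query.lower() in doc.lower()]

class TruncatingSummariser:
    """Stand-in summariser; swapping in an LLM-backed one changes nothing upstream."""
    def summarise(self, documents: list[str]) -> str:
        return " | ".join(doc[:80] for doc in documents)

def run_pipeline(query: str, collector: Collector, summariser: Summariser) -> str:
    # The pipeline depends only on the interfaces, so either component
    # can be replaced every few months without touching this function.
    return summariser.summarise(collector.collect(query))

if __name__ == "__main__":
    corpus = ["New frigate launched at the northern shipyard", "Weather update for the weekend"]
    print(run_pipeline("frigate", KeywordCollector(corpus), TruncatingSummariser()))
```

The design choice, rather than the specific code, is the point: the interfaces are the stable part of the system, and everything behind them is assumed to be disposable.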

Sean Corbett: And I'll just add that, particularly in the analyst world, we need to get comfortable with that. We've got to get comfortable with consistent development. We're very good at setting up our policies and our tradecraft and all the rest of it and saying, right, that's it, it's set. We can't think about it like that anymore. We've got to keep asking, okay, what's the next thing? What's the next thing? Just coming back to AI and how useful it is right now, one thing AI does very well is trend analysis and pattern analysis. I always go back to the Arab Spring: had we had AI at that stage looking at, say, global wheat prices and all the other inputs that were happening around the world, we would probably have predicted the Arab Spring, in terms of all the different things that contributed to making it happen in the way that it happened. We should have been able to predict it. But that still would have needed the analyst saying, "Okay, what does this mean?" And I think it's going to be a while, if ever, before we get to the stage where the analyst can say, "I don't need to do that anymore." At some stage, the human has to be in the loop. And a lot of that comes down to the fact that everybody thinks they can do OSINT right now just by Googling stuff. It's just not like that. A proper analyst goes pretty in-depth, understands the area they work in, the context, the so what, the what if. And so if anything doesn't look right, and I don't know if we're going to have time to come on to misinformation and disinformation, they will have the background, the knowledge, the experience to say something is not quite right here. And that's when you can go in and say, "Okay, let's have a look at the data, what's accurate and what isn't." So it's a complex issue.

Leendert Van Bochoven: So I would say there are three elements in what I'm constantly hearing, so to say: the data itself and its trustworthiness, the analysts, who are an essential part of that, and the underlying technology, how you deliver that and how you accelerate that cycle. The combination of these three, the data, the analyst and the technology brought together, will, I think, drive the use and adoption of OSINT in decision-making.

Phil Smith: I totally agree. And one of the things that gets overlooked is that people talk about the data, but the AI engine needs context as well, which you talked about earlier, Leendert. And to provide context, you don't just need the foundational data; you also need the data schema, and you need the data framework that then integrates into the technology framework. You've got this thing about vibe coding: I can build a proof of concept for something to do with AI and OSINT in a day, but it'll be wildly inaccurate or give me inconsistent results. Building that same thing well engineered and sustainable, so that it gives a high degree of accuracy, up into the high 90s percent and above, requires the context. It requires structure, it requires all those engineering disciplines that allow you to create that consistency. Because, picking up on the point you made, one of the risks, particularly with the next generation of analysts, is that at some point they will take the AI at face value. So at some point you have to be very, very close to 100% efficacy; otherwise you're going to be generating your own misinformation. And that's only going to get worse, because we're now in the early stages of AI models being trained on data generated by other AI, and that's going to create all sorts of odd effects that people don't really truly understand yet.

Kate Cox: So, conscious of time, we'll move towards the end of the discussion now and each pull out a key takeaway from the discussion for our live audience and for podcast listeners catching up later. From what we've just been discussing, for me it's the dual importance of people and technology, and also data, as you mentioned just now. We rightly talk a lot about technology and the promise of AI, but we shouldn't overlook the continued importance of the analyst and the human in the loop. As Sean was saying earlier, technology can help us cut through the noise and identify patterns in a way that would take an analyst a very long time to do manually, but equally, the critical thinking, contextual understanding and judgment that the analyst provides is an essential piece of the puzzle too. So the bottom line for me is that we need both. Sean, what's your key takeaway?

Sean Corbett: So mine is that we need a sense of urgency. The title of this is the future of intelligence. Well, the future of intelligence is now; it was actually five years ago, but we can't go back in time. The community at large, whether that's coalitions or national governments, is still worrying over, "Okay, where does OSINT fit? What does it even mean?" There was a US intelligence community policy document, strategy rather, that came out last year: six pages, took them two years to write, and everyone's going, "Oh, excellent, we're done, tick." But who is actually doing it for real, in an efficient way? You still have discussions about whether we need an OSINT agency, which for me would be absolutely crazy. So the usual thing with any government organization, particularly in light of what Phil and Leendert were saying, is that we just need to embrace it, throw it in there, integrate it as much as we possibly can into the normal processes, and bring it to light. And I'm just not sure we've understood the urgency, because you can be sure that our adversaries and potential adversaries out there are doing exactly that, and they don't have the same ethical or legal concerns as we do, as we've seen. So we've got to get real about this and we've got to start doing it properly. And the only way to do it properly, and of course I would say this, but it is true, is through government-industry partnerships.

Kate Cox: Absolutely. Phil, what's your takeaway?

Phil Smith: My takeaway is that this requires a degree of execution discipline, the same level of execution discipline we've always needed. And what that leads you to is that we're in a race. We're in a race for talent: analysts, technologists, data people. We're in a race for innovation in the technologies. We're in a race to build data centers that can run that many GPUs. So it's picking up on the same thing but from a different angle: there is a belief in some quarters that we will be able to execute this in a very light way and the technology will do the work for us. We've been playing that game for 30 years; it's never happened. We're in a race for talent, we're in a race for technology innovation, and we have to shrink the cycle time, which we've heard a lot of people talking about here at DSEI over the last couple of days. You've got to shrink the cycle, you've got to go faster. And I think it's [inaudible].

Kate Cox: And Leendert?

Leendert Van Bochoven: Yeah, let me build on that. I wrote down the word accelerating. So many things are accelerating. The future is now, basically, the number of data sources will keep growing, and the pace of that will accelerate. For me, the big question is how we connect that to decision-making. How do we ultimately ensure there's a connection between the data and the decisions we make based on it? The decision advantage that everybody's looking for, the holy grail, is still: how do we do that? And there will be an interplay between humans, data and technology that will actually enable us to make better and faster decisions. So for me, it's all about making better decisions at the strategic, operational and tactical levels, and about what open source intelligence can contribute to, and not hinder, so to say, that accelerating pace of decision-making. So that, for me, is the main takeaway.

Kate Cox: Thank you, and I'm sure that would be a great springboard for another 30-minute discussion, but I'd like to open this up now to any audience questions there might be. So please just raise your hands if you have a question for our panelists. Okay, so just to recap the question: if we accelerate too quickly, will the challenge of mis- and disinformation become even more of a problem?

Sean Corbett: That's a great question, and as you know, it's one of my real bugbears. Disinformation is something we could do, and have done, several podcasts on. From my perspective, regardless of what's happening in the world, the greatest threat to western civilization is disinformation. And the reason I say that is because there is so much out there that you can choose to believe whatever you want to believe, with the filter bubbles and echo chambers and all those things that come with that. We get fed the stuff that we want to see. That is a real threat, because with all the data out there, as I said when we talked about the democratization of data, not all data is the same, and a lot of it is not true. So how do we get through that? Well, it's a real challenge. It's got to be a combination of AI, but how do we know that the AI is actually telling us the truth? I mean, if you go to ChatGPT, your first source will probably be Wikipedia, just saying. But equally, it's back to the analytical piece. So it is a very, very big question and one that I think we're still struggling with. Yes, you can have counter-AI, but some of the deepfakes, particularly on the video and imagery side now, are so good that even the counter-AI cannot identify them. Extrapolate that into the future and that's quite a scary place to be.

Phil Smith: From a technology angle, because the question was about acceleration, the thing I would say is that as we accelerate, one of the things we're going to have to do, and I didn't touch on this before, is create feedback loops. If we're going to use AI extensively, we have to create feedback loops into the AI, and this is one of the things we're not very good at yet with the technology. We want to create learning systems that can learn to spot disinformation and misinformation, and the false positives and false negatives coming out of the AI's answers. If we don't do that, one would imagine that over time the quality of the information we get back from these systems will degrade. So a really, really important part of this whole cycle is how you create learning systems that continue to evolve, and that has got to be part of your foundational capability, because otherwise the cycles will spin too quickly for you to keep up with.
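As a minimal, hypothetical sketch of the kind of feedback loop Phil describes, the snippet below records analyst verdicts on AI outputs and nudges an acceptance threshold in response. The class name, verdict labels, and the simple adjustment rule are illustrative assumptions, not a description of any real system.

```python
# Hypothetical analyst-in-the-loop feedback: analysts label AI findings as
# correct, false positive, or false negative, and the pipeline adjusts the
# confidence threshold it applies before surfacing findings to analysts.

from dataclasses import dataclass, field

@dataclass
class FeedbackLoop:
    threshold: float = 0.7          # confidence required before a finding reaches analysts
    history: list[str] = field(default_factory=list)

    def record(self, verdict: str) -> None:
        """verdict is one of: 'correct', 'false_positive', 'false_negative'."""
        self.history.append(verdict)
        if verdict == "false_positive":
            # Too much noise got through: demand higher confidence.
            self.threshold = min(0.99, self.threshold + 0.02)
        elif verdict == "false_negative":
            # Relevant material was missed: relax the filter slightly.
            self.threshold = max(0.50, self.threshold - 0.02)

    def accepts(self, confidence: float) -> bool:
        return confidence >= self.threshold

if __name__ == "__main__":
    loop = FeedbackLoop()
    for verdict in ["false_positive", "false_positive", "correct", "false_negative"]:
        loop.record(verdict)
    print(f"threshold after feedback: {loop.threshold:.2f}")
    print("accept finding at 0.72 confidence?", loop.accepts(0.72))
```

In a real system the feedback would more likely retrain or fine-tune models rather than just move a threshold, but the principle is the one Phil raises: without a loop that learns from analyst corrections, output quality drifts as the cycle accelerates.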

Kate Cox: Great. All right. Well, foundational capability, that's a great segue into tomorrow's discussion. We're going to be talking about the importance of foundational intelligence at 1:30 PM. So if you're around tomorrow, please do come and join us for that. But that brings us to the end of today's panel. So a big thank you to our panelists, thank you to our audience live here at DSEI today for joining us. And thank you as always to the podcast listeners catching up on this later. Thank you and goodbye.

Speaker 1: Thanks for joining us this week on The World of Intelligence. Make sure to visit our website, janes.com/podcast, where you can subscribe to the show on Apple Podcasts, Spotify, or Google Podcasts, so you'll never miss an episode.

DESCRIPTION

Recorded live at DSEI 2025, this episode features host Kate Cox and a panel of Janes experts, Sean Corbett, Leendert Van Bochoven, and Phil Smith. They discuss the evolution of open-source intelligence (OSINT) and its vital role in global security. The panel examines the challenges posed by an overwhelming volume of data in the digital age and explores how artificial intelligence, machine learning, and automation are transforming the collection, processing, and verification of open-source information and why the human analyst is more important than ever.


Watch the video of the recording on our YouTube channel: https://www.youtube.com/watch?v=-8qE-hKSG5g

Today's Host


Harry Kemsley

| President of Government & National Security, Janes

Today's Guests


Phil Smith

| Chief Technology Officer, Janes

Kate Cox

| Director of Strategic Programmes in Janes Research, Data, and Analysis (RD&A) department

Leendert Van Bochoven

| Chief Commercial Officer, International, Janes

Sean Corbett

| AVM (ret'd) Sean Corbett CB MBE MA, RAF