March 9, 2021

The Origins and Future of Open Science



George Strawn, computing policy nestor, interviewed by host Trond Arne Undheim, futurist, investor, and author.

In this conversation, we talk about the Origins and Future of Open Science. We investigate the decisions that turned ARPAnet into the global internet and the first ISP via educational institutions. We discuss the rise and fall and rise again of the Office of Science and Technology Policy at the White House, the role of the National Academies in the US and abroad, the path towards open science with open access, preprint servers, and why big science publishers resist it. George muses on the role of science and data in the next decade.

My takeaway is that the origins of open science and the internet were a combination of savvy futuristic planning and surprising twists and turns. The magnitude of the changes has been felt by all. The future of open science still looks open-ended, but the promise of bottom-up self-regulation is more alluring than the alternative, a regulatory grab to avoid damaging lock-in effects. Data is the new business model, but the holders of big data become the arbiters of human destiny. Can we achieve George's vision of one computer, one dataset? The implications would be world-changing.

Having listened to this episode, check out National Academies as well as George Strawn's online profile:

Thanks for listening. If you liked the show, subscribe at or in your preferred podcast player, and rate us with five stars.

If you like this topic, you may enjoy other episodes of Futurized, such as episode 84 The path towards Science 2.0, episode 48, The Future of AI in government or episode 29 Future of Computational Media.

Futurized—preparing YOU to deal with disruption.




Trond Arne Undheim, Host:[00:00:00] Futurized goes beneath the trends to track the underlying forces of disruption in technology, policy, business models, social dynamics, and the environment. I'm your host, Trond Arne Undheim, futurist and author. In episode 85 of the podcast, the topic is the origins and future of open science. Our guest is George Strawn, computing policy

[00:00:23] nestor. In this conversation, we investigate the decisions that turned ARPANET into the global internet and the first ISP via educational institutions. We discuss the rise and fall and rise again of the Office of Science and Technology Policy at the White House, the role of the National Academies in the US and abroad, the path towards open science with open access, preprint servers, and why big science publishers resist it.

[00:00:49] George muses on the role of science and data in the next decade. George, how are you today?

[00:00:56] George Strawn, Computing policy nestor (guest):[00:00:56] Good day. I'm very well and hope you are

[00:00:58] too.

[00:00:59] Trond Arne Undheim, Host:[00:00:59] Yeah, it's an excellent day to talk about history and future. I agree. Look, George, you have had quite an illustrious career spanning a few decades of internet development, from way back when there was no internet.

[00:01:18] I wanted you to take us a little bit on a very important journey, because where we are headed today and where we were some 30 or more years ago are pretty interesting junctures in time. And I wanted you to reminisce a little bit about the beginning of the internet, but also specifically the beginning of open science, the decisions you were part of that turned ARPANET into something,

[00:01:47] I guess history says, far greater than you imagined, but maybe you had already imagined it. Can you bring us back to that time? First of all, just explain what your role was, and give us a sense of what the mood was back then. What did people think this animal, the ARPANET, was going to be?

[00:02:07] George Strawn, Computing policy nestor (guest):[00:02:07] Real pleasure.

[00:02:09] I'll mention before that, that this happens to be the 60th year of my involvement in the IT industry, and actually only the last half of my career has been specifically internet focused. I was at a university in the sixties, seventies, and eighties, and was involved with various networking activities.

[00:02:34] Not ARPANET, because there were only a few wealthy universities that were on the ARPANET. But we developed some local things. I actually spent a year with a computer networking startup in 1970. We developed other things. By the mid eighties, the computer science community was moving forward

[00:02:55] using the TCP/IP protocols regularly, because the latest version of Berkeley Unix had TCP/IP in it. I was chairing the computer science department at that time, so we immediately started using TCP/IP on our Berkeley VAXes. And then shortly thereafter, NSF announced a program to create the NSFNET,

[00:03:21] which would use the TCP/IP protocols and take it from a very important experiment to a larger network connecting US research universities to the new supercomputing centers that NSF was standing up.

[00:03:39] Trond Arne Undheim, Host:[00:03:39] Sorry, George, I just wanted to stop you. Back in the days, as you recall it, was there anyone who had the idea of this becoming something different than just connecting more and more researchers?

[00:03:54] And even just with that as an objective, what did you envision was going to happen over that network?

[00:04:01] George Strawn, Computing policy nestor (guest):[00:04:01] We had the best of all possible worlds, in that I'm sure our leading visionaries knew that this could be a very important development in the future. At that point, there was no universal network.

[00:04:17] That is, if you had IBM computers, you used IBM networks. If you had Digital Equipment computers, you used Digital Equipment networks, and so forth. And so incompatibility reigned: cylinders of excellence, as we'd like to say. Because of DARPA's extraordinary vision in the early sixties, realizing that we really ought to have universal connectivity,

[00:04:41] we could see that this ought to be important sometime in the future. I think most of us thought it would take longer to become important than it did. But the other side of the coin is that we had a very specific need to connect 100 research universities to five supercomputing centers. And therefore getting the political will and the money to do such a limited project was much easier than saying we were going to change the world.

[00:05:10] Most people don't believe you when you say you're going to change the world.

[00:05:14] Trond Arne Undheim, Host:[00:05:14] That's fascinating to say that you had somehow created this need, because I want to just very briefly bring us back to kind of COVID days in the last year, I guess, when you create a need like that for remote collaboration, which this was, right?

[00:05:30] And the enormous computing infrastructure was stuck in a few locations, and these research universities were distributed. But even just the audacity of saying that was going to be possible, surely even just a few years before that actually happened. Were there doubters back then? I'm not asking you to out them, but surely this can't have been

[00:05:53] extremely easy, because it was very much a research effort in the early days. It wasn't like this was crystal clear to everyone that it was all going to work this way.

[00:06:03] George Strawn, Computing policy nestor (guest):[00:06:03] Correct. It really was a research-oriented infrastructure development. In fact, supercomputing was playing the primary role. A high-level committee in the United States had complained that both Europe

[00:06:20] and Japan were giving their faculty members more access to supercomputing than the US, and that ultimately resulted in legislation directing NSF to create supercomputing centers. And at the end of that report, it said, oh, by the way, you ought to also create a network so that people at those universities don't have to go to the supercomputing centers for all their interactions.

[00:06:47] At that time, ARPANET was very experimental. Digital Equipment's DECnet was pretty operational, and there were some people who said, oh, we really ought to use DECnet because it's proven itself, it's operational, it's commercially supported. The obvious counterargument: yes, but it limits you to Digital Equipment computers.

[00:07:09] So there was quite a discussion in the mid 1980s at NSF about what to do. The final decision was made, and I would say it was a risky decision at that point, to use TCP/IP due to its universality. And then we proceeded from the mid eighties to the mid nineties to make it work.

[00:07:29] Trond Arne Undheim, Host:[00:07:29] I'm fascinated by that because I know that you have a longstanding

[00:07:33] interest and obviously experience in interoperability, and it seems to me that at several junctures during this internet history, there were choices that could have been made to arguably close it off to people with only a specific type of equipment, and by that token, either a certain socioeconomic group or a certain professional group, or even an elite group of universities. Why do you think

[00:07:59] it was possible to win through, or do you think it was almost luck that that decision was made at the end of the day?

[00:08:06] George Strawn, Computing policy nestor (guest):[00:08:06] I think the whole history is strewn with luck. As the old saying goes, it's better to be lucky than smart. And it turns out that at every point, decisions were made that favored openness as opposed to closedness.

[00:08:22] Soon after we had, what, the hundred research universities on the network, many people were saying, at least quietly, networking is going to be important for much more than supercomputing access. And so NSF launched a program to encourage the rest of higher education in the United States to also connect to the internet.

[00:08:46] The internet had been designed very appropriately, in that it's a three-tier network: the NSFNET backbone, which stretched from coast to coast, and then regional networks. There were seven or eight or nine regional networks around the country, because each one had a few research universities in it. But then NSF gave awards

[00:09:08] to other universities and colleges to connect to those regional networks, and the regional networks were anxious to expand their membership to pay their costs. They did that. The other colleges and universities began to connect. By the time I went to NSF as the NSFNET program officer in 1991, there were, oh, say at least 500

[00:09:34] colleges and universities already on the network. But even more amazing in my mind was that in 1991 already, there were about the same number of private sector companies on the regional networks, just watching to see what would happen, to keep their eye on this. So in my regard, I think the commercialization of the

[00:09:58] NSFNET began in 1991, as opposed to 1995, when we retired the NSFNET and the commercial providers took over entirely.

[00:10:09] Trond Arne Undheim, Host:[00:10:09] Well, and that brings us, eventually it's going to bring us to our contemporary days, because there is this constant oscillation between public and private interests, and you can't really build something as important and big as the internet without the collaboration of many different actors. But bring us then:

[00:10:29] we are now in the early nineties. Tell me about how science is doing generally throughout this period, because we are talking here about the origins also of open science. Was science a shared reality among, obviously, these research universities and the US government at that time? I guess you could characterize it as a heyday of optimism, or there were tasks that the government really trusted science to do.

[00:10:56] How did that evolve in this period?

[00:10:58]George Strawn, Computing policy nestor (guest):[00:10:58] Open science meant something entirely different in the 1990s than it means today. In fact, it meant in the 1990s about the same thing that it meant since the 17th century, when the Royal Academy in Britain suggested that scientists publish their articles rather than keeping them private.

[00:11:19] So open science meant then, and of course still means to a degree today: when you have results, publish them in peer-reviewed journals so that other people may see those results and build upon them for their next contributions to science. There wasn't enough data, and certainly the data was not interconnected well without the internet.

[00:11:42] So open science did not involve data or other artifacts. Openness meant openness of publication and nothing else.

[00:11:51] Trond Arne Undheim, Host:[00:11:51] Yeah, no, that's very interesting. And we'll get to the role of data soon enough. But you have had more than a little share of this experience around the National Academies, certainly in the US. If I look into the history of the National Academies, the first one, I guess, was started in 1863, and then they rolled on: there was another one in '64, and then in 1970. I don't have exactly the names of which of these academies started when, but you, at some point in your career, have been in and out of the National Academies.

[00:12:24] What kind of an institution is that, and how important have they been to science

[00:12:30] overall?

[00:12:31] George Strawn, Computing policy nestor (guest):[00:12:31] I think quite important. When I retired from NSF in 2015, I took my retirement job, quote unquote, at the National Academies. Sciences was first, and then Engineering and Medicine were added in the later days.

[00:12:48] Trond Arne Undheim, Host:[00:12:48] No, that's what I was referring to. So 1863 and 1964 and 1970.

[00:12:53] George Strawn, Computing policy nestor (guest):[00:12:53] That's right. The original goal, of course, was to give the government access to scientific expertise when it may ask for it regarding projects and the future of science and so forth. In my opinion, it's been extremely important. Over the course of my career, at professional meetings around the country,

[00:13:14] speakers would stand up and say, I recommend that you look at this new report from the National Academy of Sciences, which I think is very important and which is going to direct the future of science in this way or that way or whatever. Actually, the Academy has two goals. It's honorific, in that the top scientists, engineers, and medical people are elected to the National Academies.

[00:13:39] And then the other is the working opportunity to select not only those members but also other distinguished members of the science, et cetera, communities, and produce reports that will be important to direct the future of the scientific enterprise and engineering and medicine.

[00:13:59] Trond Arne Undheim, Host:[00:13:59] Yeah. It's a curious and obviously historically very important institution. I do feel that, starting with my generation and younger, it hasn't really had the kind of

[00:14:10] visibility that makes us look at these academies as anything other than almost historical anachronisms, which is strange, I find. How do you explain that the academies have such a low profile, comparatively? There are of course famous scientists today, but I wouldn't say they're famous because they're recognized as members of one of the academies. Maybe I'm wrong about this, but they don't seem to have the kind of public profile that they must have had

[00:14:40] in the early years.

[00:14:41] George Strawn, Computing policy nestor (guest):[00:14:41] I'm not so sure that the profile has changed. It is a low-key organization. I think that for any scientist, engineer, or medical researcher who is a member of the Academy, you would find that it definitely features prominently in their biographies. It's a great honor.

[00:14:59] And if they are applying for another job or whatever, it would feature prominently. It might even be mentioned if there's a public relations activity about a contribution somebody has just made. They would probably include, oh, and they're also a member of the National Academy of Sciences.

[00:15:16] Trond Arne Undheim, Host:[00:15:16] Yeah, I was more talking about the institutional power, but then maybe that's also my ignorance. Like you pointed out, with the Royal Society, 1660 we're talking here, and the first scientific journal, the Philosophical Transactions, five years later, those were grand statements.

[00:15:32] It was the first journal. People didn't know what a journal was. It hadn't existed until that point. Then bring us up to today. So these academies in the US: starting 1970, we had the last one sort of come on board. What has happened then with both the academies as institutions and the public endorsement of science?

[00:15:53] I know you are also involved with the Office of Science and Technology Policy at the White House, which I believe is an institution that started in 1976. Tell me what happened to that institution.

[00:16:06] George Strawn, Computing policy nestor (guest):[00:16:06] OSTP is still there. It tends to have more prominence with Democratic administrations and somewhat less prominence with Republican administrations, but it does continue on. For example, Republican administrations may not add to the director of OSTP the title of President's Science Advisor.

[00:16:27] Democrats always have a President's Science Advisor, and this year, President Biden in fact has raised that to a cabinet position. So now the director of OSTP is a cabinet member as well as the science advisor. If I may comment just one minute further on the profile that you were talking about: remember, the original goal

[00:16:50] that Lincoln signed into law was to create an entity that would give the government advice on scientific matters. So that's often, let's say, a low-key, not necessarily a highly public business, but we write reports that many times are very influential in the government's determining what scientific programs to put forward.

[00:17:12] Things like accepting supercomputing as important, accepting networking. Since the internet, there've been several National Academy reports talking about the importance of the internet, and these influence government programs and government legislation. It may be publicly a rather quiet activity, but from a policy standpoint, I think it has been and remains quite influential.

[00:17:37] Trond Arne Undheim, Host:[00:17:37] Yeah, no, I accept that point. So as we are then moving more towards the later nineties, the early two thousands, bring us in on what then starts to happen in open science, and how this new group of, by this time, fairly influential science publishers, this new group that we haven't talked about yet, starts to emerge on the scene with obviously not just one journal but a plethora of journals.

[00:18:01] And there starts to become a business model around publications that starts to take some prominence. How would you describe that movement, and what is happening to it now?

[00:18:14] George Strawn, Computing policy nestor (guest):[00:18:14] Well, of course the first business model around scientific publications is around paper journals, and that has been in existence for several hundred years.

[00:18:24] 17th and 18th century, a fairly small number of journals; 19th and 20th, an exploding number; after World War II, a huge number of journals as the whole academy expanded very greatly. As soon as the internet came on the scene, people could see, oh wait, we need to have electronic versions of the journals, not just paper versions.

[00:18:44] This was, I think it's fair to say, a conundrum for the traditional science publishers, who by this point are an important industry worldwide and an important lobbying influence with governments, because they could see their subscription-to-journals model of financing under threat.

[00:19:05] And so I would say we are just now emerging from an extended battle where the publishers by and large are accepting that things are changing because of the internet. You in Europe know that even better than we, because Europe has been taking a lead in terms of national efforts to break the subscription model

[00:19:27] for journals and move forward into an internet-based model. And it raises the whole question of what is the future of scholarly communication. If you've got enough disk storage, enough computing power, enough networking connectivity, all of a sudden you can think about: is the article the only thing we should be publishing?

[00:19:50] Should we be publishing our data? Should we be publishing our workflow, methodology, and other aspects? In other words, in this new century, the idea has dawned on people that all science outputs, data, articles, software, you name it, can be and should be published, because it will give other scientists, again, more to stand on as they reach for the next level of development.

[00:20:20] Trond Arne Undheim, Host:[00:20:20] So there are a bunch of terms here that were, and still are, hotly debated, although like you pointed out, in Europe the grand bargain seems to have been somewhat struck. Open access: how do you explain that term? It is somewhat hijacked also. It can have a lot of different meanings. And then very lately the concept of a preprint

[00:20:45] has gotten, certainly with COVID, it became something almost on everybody's mind. If you were interested in what was happening, everybody was reading these preprint articles. But first of all, the notion of open access: how old is that notion? Was it born with the early internet?

[00:21:00] George Strawn, Computing policy nestor (guest):[00:21:00] I think so.

[00:21:01] And as you say, it has many definitions, so it's probably best to give a definition and then see if that's useful. What we said in the US, and I suspect worldwide, is that if a scientific resource is available for open access, that means that you, the user who wants that information, have to provide internet access for yourself.

[00:21:25] And beyond that, access to the resource, data, articles, whatever, should be free, quote unquote. We know that those are expensive to maintain, but if the expenses can be borne by other models, then you greatly expand the reach to non-research-intensive universities, to countries around the world, et cetera.

[00:21:49] You really do have open access, just as was the original 17th-century goal. It's just that now you have many more people having access to much, much more resource.

[00:22:00] Trond Arne Undheim, Host:[00:22:00] So preprints: when did that originate? The electronic preprints seem to me a very recent invention, but the notion of circulating drafts is of course not new.

[00:22:15] George Strawn, Computing policy nestor (guest):[00:22:15] Surprisingly, it's just like when the internet went public in 1995, after NSFNET was retired and the private sector took over. It was just at that time that the public at large became aware of the internet and had no idea that its development had been going on for 30 years. The situation is similar with preprints, because that concept and the first preprint server were developed in the early nineties, 30 years ago.

[00:22:45] Trond Arne Undheim, Host:[00:22:45] That's incredible. That was news even to me. I would never have thought that was that early.

[00:22:49] George Strawn, Computing policy nestor (guest):[00:22:49] Yeah. Paul Ginsparg, a physicist who was at Los Alamos laboratory at that time and subsequently moved to Cornell University, approached the physics journals of the national physics society and said, I'd like to put up this preprint server, and I'd like you to agree that it does not constitute prior publication,

[00:23:14] and therefore that articles on the server should still be available for publication after peer review in your journals. To the great credit of the physics society, they accepted that suggestion. And so arXiv, the original archive, spelled A R X I V, was stood up in the early nineties. It immediately took off in popularity in the physical sciences and was expanded somewhat through a few other disciplines.

[00:23:44] Now we have bioRxiv and other types of archives coming up. I think this was

[00:23:52] one of the seminal developments. And as you see, it was a policy development, orchestrated first by the American Physical Society, that enabled that. And so for 30 years, scientists have been able to pre-publish their articles, getting informal peer review from the community at large.

[00:24:15] They also show their priority by the date that they put it in the preprint server. And articles appear in Science magazine from time to time which describe it: any scientist will say, I go into the office in the morning, I look in arXiv to see if anything new has been published in my area.

[00:24:42] After that I go to the lab and start working, knowing that I have done all the literature review that I have to. So it greatly improves the activities of individual scientists, gets the results out there sooner, et cetera, et cetera. So I consider this to be one of the seminal advances, as I say, 30 years ago, in the direction of open science.

[00:25:08] Trond Arne Undheim, Host:[00:25:08] So it strikes me as you're talking that so much of the development that we come to call science, or the science results we learn as individuals, is shaped by this interplay between government regulation and obviously funding, which is an aspect we haven't talked so much about, but also the private sector and these institutions and the decisions that professional academic societies make,

[00:25:36] and these trade-offs. So there's nothing here that's given. There doesn't seem to be a necessary path that science has taken. It seems like they were all decisions all along.

[00:25:48] George Strawn, Computing policy nestor (guest):[00:25:48] Yes, that's right. That's another example of it being better to be lucky than smart. Although I would say that the American Physical Society was smart and ahead of its time in making that decision to permit pre-publication of articles that would subsequently appear in their journals.

[00:26:04] But in my experience, everything is contingent, nothing preset. You go down a line, and all of a sudden, using the military term, there's this salient where if you move forward in that direction, you can make more progress than you expected the day before, et cetera.

[00:26:25] Trond Arne Undheim, Host:[00:26:25] Bring us into the mindset of what might be a big science publisher today.

[00:26:30] Like Reed Elsevier, Taylor & Francis, Wiley-Blackwell, Springer, Sage, you name it. These are maybe the top five. You could understand, though, if they have built a business on the back of one basic business model that sells either subscriptions or sort of content-based access,

[00:26:49] it doesn't make sense rationally for them to fully just let that go because history is not on their side or something. How has this battle been, would you say? And where does the path lead for actors who are in that position, that they have built up their legacy around a business model that perhaps now,

[00:27:09] for the greater good of society, needs to adapt quite significantly?

[00:27:13] George Strawn, Computing policy nestor (guest):[00:27:13] And publishers are just one example. Any industry that has a given business model that is successful will resist major changes. Patrick Winston, in a book some years ago, quoted the line that the status quo always resists the revolutionary potential of a new technology.

[00:27:33] The telecommunication industry resisted the potential of the internet because it changed the future of how telephone and other telecommunications in the US, and now the world, operate. If they had realized that change was going to happen, I think they might've exercised some lobbying capability to keep the internet from emerging as it did.

[00:27:58] Since it did emerge under government supervision and funding, they probably would have had power to say, you really shouldn't be doing that, that's a private sector activity. That's what's happened for the last 20, 30 years in publishing. The breakthrough was at the US National Academy, or rather at the US National Institutes of Health.

[00:28:20] When Harold Varmus was director of the Institutes of Health in the 1990s, he launched an effort specifically in the biomedical activities, saying the public has paid for this research, the public should be able to see it. The journals were of course dead set against it. But because everyone

[00:28:42] feels that public health is an important idea, that was a successful way to approach it from a political standpoint. And eventually the federal government stood behind the NIH's decision to force biomedical articles into an open-access regime. Subsequently, when I was at OSTP as a matter of fact, the White House came forward with a requirement to the agencies to do the same thing for other literature beyond biomedical, all scientific literature. And most revolutionary, they added data to that requirement: not only the articles but also the data ought to be publicly available.

[00:29:27] Trond Arne Undheim, Host:[00:29:27] Yeah. We'll get to the data right now. I just wanted to ask you one thing. It seems to me in these discussions, many times, one kind of sloppily assumes that just because one business model dies, all business models die, but that's not the case at all. With open science, you're just shifting the way information flows. It doesn't mean that there's not money in science. I guess it's just that

[00:29:54] the publisher has to take on a different role, but in some ways it empowers the scientists and the institutions behind science to charge for their work in more ways than one. Yeah.

[00:30:07] George Strawn, Computing policy nestor (guest):[00:30:07] This is an expensive enterprise. And. There are certain expenses that are reduced with internet distribution.

[00:30:16] You save a few for us to save some mailing expense. You save space in the libraries around the research universities. So there are savings. The real advances though, are the broader access to this information in the world. And we are still wrestling with what are the right business models.

[00:30:36] First of all, the expenses should be lowered. A less expensive business model ought to be available. And that is not happy news to the publishing industry. That is, if it's less expensive to produce the product, you don't need as big a revenue source as you had before. This happened in telecommunications beginning in the eighties, when AT&T let go tens of thousands of employees as new methods in telecommunications were less labor-intensive.

[00:31:07] So anytime automation moves into a field, unemployment follows, I'm sorry to say. And in fact, one of my concerns is, as we move forward in the future, how many jobs are going to be left and what are we going to do with the people who are permanently unemployed? That's obviously a topic for a different day, but some people have said that may be the most important problem of the 21st century.

[00:31:31] Trond Arne Undheim, Host:[00:31:31] So that is a larger issue. Let's hold off on that discussion for a second, because I wanted to get to your data point, and also to this notion of Science 2.0; potentially the two of them are slightly related. When people speak about Science 2.0, what are they talking about?

[00:31:49] Is it just open science, which we have been talking about now? Or is it something more?

[00:31:53]George Strawn, Computing policy nestor (guest):[00:31:53] That's a good question. It turns out that's not a term that's been widely used in the United States.

[00:32:00] Trond Arne Undheim, Host:[00:32:00] It might be a European term, yeah.

[00:32:04] George Strawn, Computing policy nestor (guest):[00:32:04] Sure. As for the definition, I would hazard a guess that, yes, it means publishing all science outputs, not just journal articles.

[00:32:12] Trond Arne Undheim, Host:[00:32:12] So tell me more about your notion of data, because I know you're keenly aware of the importance of data, as opposed to just publishing the article about the data. Tell me how that is evolving and how you think it will evolve, with the role of science and data, as we move into the next decade, which we have begun with a bang here.

[00:32:36] George Strawn, Computing policy nestor (guest):[00:32:36] Yes. I think the last 25 years, let's say, have been network-centric because of the emergence of the internet, and then the emergence of huge application providers on the internet, such as Google, Amazon, Facebook, et cetera, has made it central to the economy in good ways and bad.

[00:33:02] Fake truth and conspiracy theories have proven that all powerful technologies can be used for ill as well as for good. But the last 25 years focused on this great thing called the internet that has interconnected the world. The next 25 years will be focused on data, in the same sense that the last 25 years were focused on the internet.

[00:33:26] I like to whimsically describe three eras of computing. The first era of computing was of isolated computers, before the internet. The second era of computing, which we're in right now, is connected computers but datasets that are not interoperable. Computers are interoperable, but the data on the computers are not interoperable.

[00:33:49] Some years ago, one of the networking companies had a marketing slogan which said "the network is the computer." It certainly is now, right? When you use an app, that's connecting not only your computer, it's connecting the server computers, et cetera. The network is the computing resource you're using.

[00:34:06] So we now live in an era of one computer but multiple datasets that are not interoperable. In the future, the ideal world will be one computer and one dataset, in the sense that if all the datasets in science, or beyond that, are interoperable, then the ability to combine data from different areas becomes immediately much easier,

[00:34:32] and much more progress will be available.

[00:34:35] Trond Arne Undheim, Host:[00:34:35] Fascinating vision, George, because I see immediate applications, and of course so do the Facebooks of the world, right? Because, in essence, if you just mention health and think of the kind of coordination you could do if you had a global dataset on any

[00:34:54] individual aspect, or thousands of aspects, of all human beings, what you could do with that. But it of course goes beyond health into just other science, any observations, the environment, right? All of the coordination problems that we've been plagued with and

[00:35:10] George Strawn, Computing policy nestor (guest):[00:35:10] commercial activity as well as scientific activities.

[00:35:13] It is a great goal. It's not easy. And by the way, I don't think it will be done by one dataset somewhere on some topic. The goal is to still have millions of datasets, but have the metadata and the other activities that permit interoperability between the data stored in different places, in different formats.

[00:35:36] Trond Arne Undheim, Host:[00:35:36] And it seems to me, George, to be also a very significant regulatory challenge, because we currently don't really have international jurisprudence that is capable of setting mandatory interoperability requirements for things like health, right? If we were to say every person has the right... the UN Charter defines some of these things, that people have the right to information about certain things and to be free.

[00:36:04] But if you really think about it, to be free these days is to have information.

[00:36:09] George Strawn, Computing policy nestor (guest):[00:36:09] Exactly. And the goal will be to do as much of this bottom-up as possible, as opposed to by top-down regulation. The ideal, which probably isn't achievable, is that we develop the technology, and subsequently the laws and regulations, which support developing interoperability

[00:36:31] interfaces, let's say, between different datasets, and people realize the benefit. I think the pandemic has given us an interesting (and that's too mild a word) wake-up call: the world could have responded much better had data been interoperable at this point.

[00:36:52] Now, many of my colleagues have been spending this virus time pushing interoperable standards, so that when (not if, but when) the next pandemic hits, we should be in better shape, with interoperable data that will enable a more vigorous scientific response to the technical issues.

[00:37:14] Trond Arne Undheim, Host:[00:37:14] But how can one do that in the current regulatory environment? Even with a U.S. that has a resurgent Democratic administration with a lot more ambition on the global scene than the previous administration, we are moving into a world where, arguably, in some domains there are no superpowers, or if there are, it's a multipolarity, and it's not even clear that governments, I would say, are the superpowers anymore.

[00:37:41] There are private actors, and there's obviously illicit activity and networks that also have considerable power. How can you still have such faith that a bottom-up type of initiative, one, would be realistic, and two, would even go in the direction that you want it to go? I'm thinking about some of the social movements of recent times; it's not very clear to see that all of those social movements are necessarily promulgating the public

[00:38:08] good. In other words, there's nothing, to me, in nature that would mean that we would all converge around interoperability. It is a decision to make, and there are always losers in interoperability, right?

[00:38:22] George Strawn, Computing policy nestor (guest):[00:38:22] Absolutely. But I guess I would, again, appeal to history and say that I have seen it work in the past, and I think it has a better chance of working in the future than a top-down solution.

[00:38:33] Take the internet again as an example. At the time the internet was emerging, the nations of the world, including the U.S., had already agreed by treaty regulation on Open Systems Interconnection (OSI), which was going to be a network to do what the internet was already doing.

[00:38:53] So that was the de jure network of the time. And as our use of TCP/IP went forward, we always had to say, "As soon as OSI is available, we will convert everything to it." So the internet was a de facto solution; we were waiting for the de jure solution. Of course, the progress on the internet turned out to be so rapid that we in effect disregarded the de jure solution, and OSI became a feature of history rather than something we were marching toward.

[00:39:26] And again, that began in the U.S. By the way, I see more activity on data in Europe than in the U.S. So it's conceivable to me that Europe might have the lead in data the same way that the U.S. had the lead in the internet. Or who knows? Like you say, it's a multipolar world. China is on the march.

[00:39:45] We could see something develop there as well. So a bottom-up thing will probably develop within a given geopolitical domain, whether it's Europe or the U.S. or China or something else. And if it turns out to be tremendously successful, as the internet was, then it starts going, by osmosis, to the rest of the world.

[00:40:11] The rest of the world eventually accepted that the internet was the way to interconnect computers, and other efforts were shut down. I expect that we'll see somebody make similar great progress in interoperable datasets in one area, and other areas find it and say, "Oh, we could use that too."

[00:40:33] And then it begins again in a distributed way. It's much better to boil a lake than to start out boiling the ocean, and a global solution is boiling the ocean. If a smaller, more special-purpose activity allows for the development of a suitable technology, then it grows organically from the bottom up.

[00:40:55] I've seen that work more often than not. And so my faith is that that is probably the way to bet for the future of data.

[00:41:01] Trond Arne Undheim, Host:[00:41:01] That certainly sounds wise to me. So I wanted to bring up that, in more ways than one, you are a distance runner. You told me you have run an estimated 55,000 miles.

[00:41:10] Only a computer scientist could have measured that, their total miles in a lifetime. You've been an academic computer scientist and went on to become an administrator and an NSF employee. And you have been a CIO of OSTP and in charge of interagency efforts, and then heavily involved with the academies and in coordinating various networks.

[00:41:33] What is your best advice? As a long distance runner for people who maybe haven't had the benefit of that distance that you have run, what are the initiatives that matter if you're going to get involved in the debates that we have been surfacing today on this podcast, what are the networks where this is happening?

[00:41:51] Where should people go to even just start to grasp the enormity of the challenges ahead of us for science, for the internet and, I guess, for the planet?

[00:42:02] George Strawn, Computing policy nestor (guest):[00:42:02] Thank you for that far-reaching question. I do think running has something to do with it, because I like to say these developments are a marathon, not a sprint.

[00:42:11] They are going to take a long time. They're going to require endurance. If you start out too fast, you're going to wind yourself and end up not finishing, so preparing for a long haul, either on the race course or on the data course, is appropriate. The one thing that is different, I would say, in the data world from the networking world is this: in networking history, the private-sector

[00:42:39] computer companies were focused on their own clientele and focused on vendor lock-in, right? They didn't want to connect other vendors' computers to their networks. And therefore the far-seeing action of the U.S. government to provide a universal solution was extremely important.

[00:42:57] Although I'm not sure every computer company would tell you that they liked that solution. And now we find that there are private-sector players that are very actively involved in data in general. Let me take just one example, Google. Google has a huge project called knowledge networks, or Knowledge Graph,

[00:43:20] I think they call it. Google Knowledge Graph is, in fact, a way to format data which is quite compatible and useful in open science activities and other types of activities. Other private-sector companies are also pushing forward in this area. So the difference is that now the case can probably be made that the private sector is ahead of the public sector in terms of utilizing advanced forms of data.

[00:43:50] So we have to be careful in the public sector to stand on the coattails of what has already been done in the private sector, as opposed to thinking we have to start from scratch. I think the private sector has played the role that DARPA played in the networking direction. And then we have to make sure that we form a new kind of public-private partnership.

[00:44:15]In the internet, the thing that made it work was excellent forms of public private partnership. We need new forms of public private partnership to bring this off. Our goal is to make sure that whatever is developed by the private sector supports the goals of science, as well as public search and game-playing and so on and so forth.

[00:44:33] So a closer alliance with the private sector, which brings its own difficulties, of course. But still, I think that's something I would watch very closely for the future.

[00:44:43] Trond Arne Undheim, Host:[00:44:43] That sounds very wise to me. I hear from you, the long-distance runner, a kind of cautious optimism about bottom-up. And the notion of bottom-up is complicated here, because you're not talking about some people in a forest who are isolated from everything else.

[00:45:01] Bottom-up just means that there's not a top-down, overly specified regulatory regime, but that it emerges from the players that are currently in the space of data. But you seem cautiously optimistic that a better direction can emerge.

[00:45:18] George Strawn, Computing policy nestor (guest):[00:45:18] That's right. Let's put it this way.

[00:45:19] I'm only cautiously optimistic in the short term. I'm very optimistic in the long term. This is just too good a thing not to do; human society will do it eventually, because we will all recognize the importance of it. There's a new journal called Data Intelligence, for which I was involved in producing a special issue recently.

[00:45:42] I think it just hit the airwaves. And the article that I authored was entitled "Open Science and the Hype Cycle." The consulting firm Gartner created something they called the hype cycle some years ago, saying that many technologies go through a peak of expectations and then a decline into a valley of despair after people find out it wasn't a silver bullet to solve all problems.

[00:46:10] But then, for those technologies that really are going to amount to something, we climb back up onto a plateau of productivity. And I think we are at the peak of expectations with open science right now. And I think we have a number of problems that have to be solved. And so I simply warned my colleagues that over the next decade or so there might be periods of disappointment and disillusion because it didn't develop as rapidly as we thought.

[00:46:40] But I'm convinced that this is such an important development that it will occur over the long term and we will hit a plateau of productivity. And that's stating it too mildly. In science's case, I think the plateau we will hit may be as revolutionary as the original open science decision in the 17th century to publish scientific articles.

[00:47:05] I think it's that important.

[00:47:06] Trond Arne Undheim, Host:[00:47:06] I wanted to end on that note. Those are certainly big thoughts and big developments. I would like to think that we are in that era. We certainly seem to be in an era where there's so much going on that there suddenly is a job again for people like me, who aim to discuss the future.

[00:47:23] It seemed to me that, for the last few decades, people have all been futurists, and it has almost seemed like we all knew what was coming and a futurist would be out of a job because it was so simple to predict what would happen. But I think COVID and other things have reminded me and others that the future isn't as easy to predict, neither the direction nor the speed.

[00:47:45] And that's a good reminder, I think.

[00:47:47] George Strawn, Computing policy nestor (guest):[00:47:47] The one thing that I think we can predict, and futurists are going to be even more important in this regard, is that in 2100 things will be much more different from 2000 than 2000 was from 1900. That is, the 21st century will see much greater developments and changes than the 20th century did, and we know the 20th century saw the greatest changes in the history of mankind.

[00:48:19] Trond Arne Undheim, Host:[00:48:19] Yeah, those are incredible things to ponder. And I wanted to thank you so much for having brought us along for this journey, a personal journey through science and its future. It was my pleasure. Thank you. You have just listened to episode 85 of the Futurized podcast with host Trond Arne Undheim, futurist and author.

[00:48:45] The topic was the origins and future of open science. In this conversation, we talked about the decisions that turned ARPAnet into the global internet and the first ISP via educational institutions. We discussed the rise and fall and rise again of the Office of Science and Technology Policy at the White House, the role of the National Academies in the U.S. and abroad, the path towards open science with open access and preprint servers, and why big science publishers resist it. George muses on the role of science and data in the next

[00:49:19] decade. My takeaway is that the origins of open science and the internet were a combination of savvy futuristic planning and surprising twists and turns. The magnitude of the changes has been felt by all. The future of open science still looks open-ended, but the promise of bottom-up self-regulation is more alluring than the alternative,

[00:49:40] a regulatory grab to avoid damaging lock-in effects. Data is the new business model, but the holders of big data become the arbiters of human destiny. Can we achieve George's vision of one computer, one dataset? The implications would be world-changing. Thanks for listening. If you like the show, subscribe in your preferred podcast player and rate us with five stars.

[00:50:06] If you like this topic, you may enjoy other episodes of Futurized, such as episode 84 on the path toward Science 2.0, episode 48 on AI in government, or episode 29 on the future of computational media. Futurized: preparing you to deal with disruption.



George Strawn

Computer policy nestor

Dr. George O. Strawn served as the Director of the Federal Networking and Information Technology Research and Development (NITRD) National Coordination Office (NCO) between November 2009 and June 2015. He also served as the Co-chair of the NITRD Subcommittee of the National Science and Technology Council (NSTC) Committee on Technology (CoT), where he oversaw the operations and activities of the NITRD Program.

Dr. Strawn was on assignment to the NCO from the National Science Foundation (NSF), where he served as Chief Information Officer (CIO). As the CIO for NSF, he guided the agency in the development and design of innovative information technology, working to enable the NSF staff and the international community of scientists, engineers, and educators to improve business practices and pursue new methods of scientific communication, collaboration, and decision-making.

Prior to his appointment as NSF CIO, Dr. Strawn served as the executive officer of the NSF Directorate for Computer and Information Science and Engineering (CISE) and as Acting Assistant Director for CISE. Previously, Dr. Strawn had served as the Director of the CISE Division of Advanced Networking Infrastructure and Research, where he led NSF’s efforts in the Presidential Next Generation Internet Initiative. During his years at NSF, Dr. Strawn was an active participant in activities of the interagency IT R&D program that is now called NITRD.

Prior to coming to NSF, Dr. Strawn was a Computer Science faculty member at Iowa State University (ISU) for a number of years. He also served there as Director of the ISU Computation Center and Chair of the ISU Computer Science Department. Under his leadership, ISU became a charter member of MIDNET, a regional NSFNET network; he led the creation of a thousand-workstation academic system based on an extension of the MIT Athena system; and the ISU Computer Science department was accredited by the then-new Computer Science Accreditation Board.

Dr. Strawn received his Ph.D. in Mathematics from Iowa State University and his BA Magna Cum Laude in Mathematics and Physics from Cornell College. He was elected as a Fellow of AAAS in October 2012.