Expand the sections below to view information, including abstracts and slides, from past Computing Colloquium speakers
August 24, 2017
August 31, 2017
Dr. Francesca Spezzano – Assistant Professor, Dept. of Computer Science, Boise State University
Ensuring the Integrity of Wikipedia: A Data Science Approach
Wikipedia is the world’s biggest free encyclopedia, read by many users every day. Thanks to the mechanism by which anyone can edit, its content grows and is kept constantly updated. However, malicious users can take advantage of this open editing mechanism to seriously compromise the quality of Wikipedia articles. The main form of content damage is vandalism, defined by Wikipedia itself as “any addition, removal, or change of content, in a deliberate attempt to compromise the integrity of Wikipedia”. Other forms of damaging edits are page spamming and dissemination of false information, e.g. through hoax articles. In this talk, we discuss two research efforts that share the goal of ensuring the content integrity of Wikipedia and show that our approach significantly outperforms the tools currently running on Wikipedia to detect damaging edits. First, we introduce DePP, the state-of-the-art tool for detecting article pages to protect. Page protection is a mechanism used by Wikipedia to place restrictions on the type of users that can make edits, in order to prevent vandalism, libel, or edit wars. Second, we present our work on identifying malicious users such as vandals and spammers. Our approach looks at users’ edit behavior rather than edit content, so it can be used independently of the Wikipedia language version.
September 7, 2017
Dr. Robert Lund – Professor, Dept. of Mathematical Sciences, Clemson University
Bayesian Multiple Breakpoint Detection: Mixing Documented and Undocumented Changepoints
This talk presents methods to estimate the number of changepoints and their locations in time-ordered data sequences when prior information is known about some of the changepoint times. A Bayesian version of a penalized likelihood objective function is developed from minimum description length (MDL) information theory principles. Optimizing the objective function yields estimates of the number of changepoints and their location times. Our MDL penalty depends on where the changepoints lie, not solely on the total number of changepoints (as classical AIC and BIC penalties do). Specifically, configurations with changepoints that occur relatively close to one another are penalized more heavily than sparsely arranged changepoints. The techniques allow for autocorrelation in the observations and mean shifts at each changepoint time. This scenario arises in climate time series where a “metadata” record exists documenting some, but not necessarily all, of the station move times and instrumentation changes. Applications to climate time series are presented throughout.
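The penalized-likelihood idea in the abstract above can be illustrated with a minimal sketch. It is not the speaker's method: it swaps in a constant penalty per changepoint (in the spirit of BIC) rather than the location-dependent MDL penalty, ignores autocorrelation and prior metadata, and uses a squared-error cost for mean shifts. The function names and the penalty value are hypothetical choices for illustration.

```python
def segment_cost(prefix, prefix_sq, s, t):
    """Sum of squared deviations from the segment mean over data[s:t]."""
    n = t - s
    total = prefix[t] - prefix[s]
    total_sq = prefix_sq[t] - prefix_sq[s]
    return total_sq - total * total / n


def detect_changepoints(data, penalty):
    """Minimize (total segment cost) + penalty * (number of changepoints).

    Assumption: a constant penalty per changepoint, unlike the MDL penalty
    described in the abstract, which depends on changepoint locations.
    """
    n = len(data)
    prefix, prefix_sq = [0.0], [0.0]
    for x in data:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    best = [0.0] * (n + 1)   # best[t]: optimal objective for data[:t]
    best[0] = -penalty       # so the first segment carries no penalty
    last = [0] * (n + 1)     # last[t]: start index of the final segment
    for t in range(1, n + 1):
        best[t], last[t] = min(
            (best[s] + segment_cost(prefix, prefix_sq, s, t) + penalty, s)
            for s in range(t)
        )

    # Walk back through the stored segment starts to recover changepoints.
    changepoints = []
    t = n
    while t > 0:
        s = last[t]
        if s > 0:
            changepoints.append(s)
        t = s
    return changepoints[::-1]


# Synthetic series with one mean shift halfway through.
series = [0.1, -0.1] * 5 + [5.1, 4.9] * 5
print(detect_changepoints(series, penalty=1.0))
```

A larger penalty yields fewer changepoints; a location-aware penalty such as the one in the talk would additionally discourage changepoints that sit close together.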
Dr. Robert Lund is Professor of Mathematical Sciences at Clemson University and a Fellow of the American Statistical Association. He obtained his PhD in 1993 from the University of North Carolina and has published over 100 papers, co-authored one book, and supervised 20 PhD students to date. His research interests include Markov chains, statistical climatology, and time series. Presently, Dr. Lund is serving as an NSF Program Director for the Division of Mathematical Sciences.
September 14, 2017
Dr. Natasha Flyer – Scientist III, Institute for Mathematics Applied to the Geosciences, NCAR
Radial Basis Function-Generated Finite Differences (RBF-FD): New Opportunities for Applications in Scientific Computing
Radial Basis Function-generated Finite Differences (RBF-FD), a novel mesh-free method, has the ease of classical FD, yet combines high levels of accuracy with complete geometric flexibility, which is essential both for local refinement and for accurately handling irregular boundaries/surfaces. Furthermore, algorithmic complexity does not increase with dimension, and the method inherently lends itself to short, simple codes. It is also highly competitive compared to other state-of-the-art numerical methods. In recent benchmarking tests on the three dominant architectures composing high-performance computing systems today (Intel Multicore, Manycore and Nvidia GPUs), it has demonstrated excellent performance due to the very sparse, compact structure of its differentiation matrices. In this presentation, we highlight some recent RBF-FD calculations.
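To make the idea of RBF-generated differentiation weights concrete, here is a minimal one-dimensional sketch. It assumes a Gaussian radial function with a hand-picked shape parameter `eps`, uses only three nodes, and omits the polynomial augmentation often used in practice; the helper names are hypothetical and this is not code from the talk.

```python
import math


def solve(A, b):
    """Solve A w = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    w = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * w[c] for c in range(r + 1, n))
        w[r] = (M[r][n] - tail) / M[r][r]
    return w


def rbf_fd_weights(nodes, center, eps):
    """Weights w such that sum(w[i] * f(nodes[i])) approximates f'(center).

    The weights are chosen so the formula is exact for every Gaussian
    basis function phi(x - x_i) centered at a node.
    """
    def phi(r):
        return math.exp(-(eps * r) ** 2)

    A = [[phi(xi - xj) for xj in nodes] for xi in nodes]
    # d/dx of phi(x - xi), evaluated at x = center
    b = [-2 * eps ** 2 * (center - xi) * phi(center - xi) for xi in nodes]
    return solve(A, b)


nodes = [-0.1, 0.0, 0.1]
weights = rbf_fd_weights(nodes, center=0.0, eps=1.0)
estimate = sum(w * math.sin(x) for w, x in zip(weights, nodes))
print(weights, estimate)  # estimate should be close to cos(0) = 1
```

Nothing in `rbf_fd_weights` requires the nodes to be equally spaced, which is the source of the geometric flexibility mentioned in the abstract; in higher dimensions only the distance computation changes.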
Dr. Natasha Flyer received her Ph.D. from the University of Michigan, Ann Arbor. She is a staff scientist at the National Center for Atmospheric Research in Boulder, Colorado. Her research interests focus on the development of computational methods for solar physics and geosciences.
September 21, 2017
J.R. Tietsort – Chief Information Security Officer at Micron
A look at interesting cyber events, why they happen, and how a large enterprise deals with these risks.
J.R. Tietsort is the Chief Information Security Officer for Micron Technology and has been with the company since 1995. In this role, he is responsible for Micron’s enterprise information security program for the protection of intellectual property. He is a Certified Information Security Manager, holds a Bachelor’s degree in Management of Information Systems, and a Master’s degree in Business Administration from Boise State University. Mr. Tietsort is an advisory member of the IT Leadership Roundtable of the Treasure Valley and the Idaho Governor’s Cybersecurity Task Force. He speaks at various events in and outside of Idaho, including the BSU Undergraduate Research Conference. His current focus is on advanced cyber threats, including the prevention of cyber espionage and intellectual property theft.
September 28, 2017
Dr. Scott Baden – UCSD and LBNL
How to Tolerate Communication Costs and How to Reduce Them
I will describe recent work in two complementary projects that each aim to diminish the impact of steadily increasing communication costs on scalable systems. The first project, Toucan, uses domain-specific translation to restructure MPI applications to tolerate communication costs. The performance of Toucan’s translated code is competitive with that of manual restructuring. The second is a PGAS library, UPC++, that leverages the GASNet-EX communication library to deliver close-to-the-metal communication performance.
October 5, 2017
Dr. Liljana Babinkostova – Boise State University
Lattice-based Cryptography: Short Integer Solution and Learning with Errors
Sustained advances in automating, interconnecting and miniaturizing technology and in securely managing data assets require a computing basis tuned to efficient use of limited energy and memory resources. Recent advances in quantum computing have triggered widespread interest and an urgent need for a new range of mathematical problems for post-quantum security. These not only present challenging and exciting opportunities for researchers from a wide range of fields, but also offer an exceptional opportunity for the future generation of scientists to participate in groundbreaking work and to prepare for new scientific challenges. This talk features current developments and future directions of two problems used in modern lattice-based cryptographic schemes: Short Integer Solution (SIS) and Learning With Errors (LWE).
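For readers unfamiliar with LWE, the following toy sketch shows how noisy inner products can hide a single bit, in the style of Regev's well-known LWE-based encryption. It is an illustration only, not material from the talk: the parameters (`N`, `M`, `Q`) are tiny and hypothetical, the error values are drawn from {-1, 0, 1}, and Python's `random` module is not a cryptographic source of randomness, so the code offers no security.

```python
import random

N, M, Q = 8, 20, 97  # toy dimension, sample count, modulus (assumed values)


def keygen(rng):
    """Secret s and public LWE samples (A, b = A*s + e mod Q)."""
    s = [rng.randrange(Q) for _ in range(N)]
    A = [[rng.randrange(Q) for _ in range(N)] for _ in range(M)]
    e = [rng.choice((-1, 0, 1)) for _ in range(M)]
    b = [(sum(a * si for a, si in zip(row, s)) + ei) % Q
         for row, ei in zip(A, e)]
    return s, (A, b)


def encrypt(pk, bit, rng):
    """Sum a random subset of samples and add bit * floor(Q/2)."""
    A, b = pk
    subset = [i for i in range(M) if rng.random() < 0.5]
    u = [sum(A[i][j] for i in subset) % Q for j in range(N)]
    v = (sum(b[i] for i in subset) + bit * (Q // 2)) % Q
    return u, v


def decrypt(s, ciphertext):
    """Remove <u, s>; what remains is bit * floor(Q/2) plus small noise."""
    u, v = ciphertext
    d = (v - sum(ui * si for ui, si in zip(u, s))) % Q
    return 1 if Q // 4 < d < 3 * Q // 4 else 0


rng = random.Random(2017)
secret, public = keygen(rng)
for bit in (0, 1, 1, 0):
    assert decrypt(secret, encrypt(public, bit, rng)) == bit
```

Decryption is always correct here because the accumulated error is at most M = 20 in absolute value, which is below Q/4; recovering the secret from the public samples without knowing the errors is the LWE problem.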
October 12, 2017
DaRon Huffaker – Data Science & Statistics Manager at Micron
Big Data and Data Science at Micron Technology, Inc
J. DaRon Huffaker manages Micron’s Global Quality Data Science and Statistics Team. He has been a Micron team member for 20+ years. Over the past 3 years, he has formed two data science teams focusing on big data analytics and predictive modeling/machine learning. Prior to that, he worked on various projects in different departments and positions at Micron: created a production risk-based sampling methodology as a Principal Statistician in Operations Central Teams; led global alignment of product critical parameters as an Assembly QA Lead/Engineer; implemented metrology, IQC, and SPC and performed failure analyses as a Transform Solar Senior Quality Engineer; assisted in creating and applying a statistical forecasting process for product demand as a Forecast Manager/Statistician in Supply Chain; formed and managed a QA Statistics Group focused on teaching engineering short courses in applied statistics, setting and monitoring DRAM product burn-in criteria, and providing statistical consulting on numerous projects related to product and process improvements. Mr. Huffaker has a B.Sc. degree in Statistics/Quality Science with an emphasis in Manufacturing Engineering Technology from Brigham Young University and an M.Sc. degree in Applied Statistics from Rochester Institute of Technology. He is a senior member of the American Society for Quality (ASQ) and holds six professional certifications in that organization.
October 19, 2017
Dr. Aaron Halfaker – Senior Research Scientist at Wikimedia Foundation
Engineering at the Intersection of Productive Efficiency, Ideology, and Ethical AI in Wikipedia
Wikipedia has become a dominant source of reference information for more than half a billion people every month. Through its improbable rise to popularity, this “free encyclopedia that anyone can edit” has become a synecdoche for open production communities online. In order to operate at massive scales (~160k edits per day), Wikipedians have embraced algorithmic technologies that bring efficiency and consistency to the wiki’s complex, distributed processes. These algorithms mediate social processes, governance decisions, and editors’ perceptions of each other. Specifically, so-called “black box” artificial intelligences have proven invaluable for supporting curation activities at scale, but they also have the potential to silence voices and perpetuate biases in insidious ways. Despite Wikipedians’ open, auditable processes, that’s exactly what’s been happening. In this talk, I’ll introduce “ORES,” an open AI platform that is designed to enable Wikipedia’s technologists to enact alternative ideological visions and to enable researchers to easily perform audits. I’ll share some lessons that we’ve learned maintaining a large-scale, generalized AI service and discuss a call to action directed towards critical algorithms researchers to take advantage of this platform for their studies.
October 26, 2017
Dr. Timo Baumann – Systems Scientist, Language Technologies Institute, School of Computer Science, Carnegie Mellon University
Incremental Language Processing Enables Faster (and Better) Responses
Human speakers and listeners process spoken language incrementally, piece-by-piece and just-in-time, with the different processes involved working concurrently, on different time scales, and with varying degrees of specificity and flexibility. Focusing on incremental speech output in particular, I present applications to interactive use-cases that have led to radically improved human-computer interaction.
November 2, 2017
Dr. Wolfgang Bangerth – Professor, Department of Mathematics, Colorado State University
Simulating Complex Flows in the Earth Mantle
On long enough time scales, the Earth mantle (the region between the rigid plates at the surface and the liquid metal outer core at depth) behaves like a fluid. While it moves only a few centimeters per year, the large length scales nevertheless lead to very large Rayleigh numbers and, consequently, very complex and expensive numerical simulations. At the same time, the inaccessibility of the Earth mantle to direct experimental observation implies that numerical simulation is one of the few available tools to elucidate what exactly is going on in the mantle, how it affects the long-term evolution of Earth’s thermal and chemical structure, and what drives and sustains plate motion.
I will here review the approach we have taken in building the state-of-the-art open source solver ASPECT (see http://aspect.geodynamics.org) to simulate realistic conditions in the Earth and other celestial bodies. ASPECT is built using some of the most widely used and best software libraries for common tasks, such as deal.II for mesh handling and discretization, p4est for parallel partitioning and rebalancing, and Trilinos for linear algebra. In this talk, I will focus on the choices we have made regarding the numerical methods used in ASPECT, and in particular on the interplay between higher order discretizations on adaptive meshes, linear and nonlinear solvers, optimal preconditioners, and approaches to scale to thousands of processor cores. All of these are necessary for simulations that can answer geophysical questions.
November 9, 2017
Dr. Alessandra Scafuro – Assistant Professor, Department of Computer Science, North Carolina State University
TumbleBit: An Untrusted Bitcoin-Compatible Anonymous Payment Hub
Bitcoin was initially conceived as a way for people to exchange money anonymously. Lately, however, it was discovered that it is possible to track Bitcoin transactions and identify the parties
In this talk, I will present TumbleBit, a cryptographic protocol that allows parties to make anonymous Bitcoin payments via an untrusted server, called Tumbler. No one, not even the Tumbler, can tell which payer paid which payee during a TumbleBit epoch. TumbleBit consists of two interleaved fair-exchange protocols that prevent theft of bitcoins by cheating users or a malicious Tumbler. TumbleBit combines fast cryptographic computations (performed off the blockchain) with standard bitcoin scripting functionalities (on the blockchain) that realize smart
November 16, 2017
Dr. Ryan Henry – Assistant Professor of Computer Science, School of Informatics and Computing, Indiana University
Efficient, Expressive, and Private Information Retrieval from Indexes of Queries
Private information retrieval (PIR) is a cryptographic technique that enables users to fetch records from untrusted and remote database servers without revealing to those servers which particular records are being fetched. This talk will discuss “indexes of queries”, a novel mechanism for supporting efficient, expressive, and information-theoretically private single-round queries over multi-server PIR databases. The indexes of queries approach decouples the way that users construct their requests for data from the physical layout of the remote data store, thereby enabling users to fetch data using “contextual” queries that specify /which/ data they seek, as opposed to the more common “positional” queries that specify /where/ those data happen to reside within the database.
For example, a Twitter-like microblogging service could employ indexes of queries to let users fetch “the 5 most recent tweets by @boisestatecs” or “the 10 top trending tweets for hashtag #boisestate”. Our basic approach is compatible with any PIR protocol in the ubiquitous “vector-matrix” model for PIR, though the most sophisticated and useful of our constructions rely on some nice algebraic properties of Goldberg’s Shamir-based IT-PIR protocol (Oakland 2007). We have implemented our techniques as an extension to Percy++, an open-source implementation of Goldberg’s IT-PIR protocol. Our experiments indicate that the new techniques can greatly improve not only utility for private information retrievers but also efficiency for private information retrievers and servers alike.
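The contrast between “positional” and “contextual” queries in the abstract above can be sketched with a toy two-server scheme. This is an assumption-laden illustration, not the speaker's construction: it uses simple XOR secret sharing over bits rather than Goldberg's Shamir-based protocol, it is not Percy++ code, and the index is modeled as a plain list in which entry `k` names the database position of the `k`-th contextual result (equivalent to a matrix whose rows are one-hot positional queries). All names and records are hypothetical.

```python
import secrets


def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def make_queries(k, length):
    """Split the one-hot vector for slot k into two XOR shares.

    Each share on its own is a uniformly random bit vector, so a single
    server learns nothing about k.
    """
    share1 = [secrets.randbelow(2) for _ in range(length)]
    share2 = list(share1)
    share2[k] ^= 1
    return share1, share2


def server_answer(query_bits, index, db):
    """Vector-matrix product over GF(2): (query * index) * db."""
    answer = bytes(len(db[0]))
    for k, bit in enumerate(query_bits):
        if bit:
            answer = xor_bytes(answer, db[index[k]])
    return answer


# Hypothetical database of equal-length records, replicated on two servers.
db = [b"tweet-A", b"tweet-B", b"tweet-C", b"tweet-D"]
# Hypothetical index: slot 0 is "most recent", slot 1 "second most recent", ...
most_recent_index = [2, 0, 3]


def fetch(slot):
    q1, q2 = make_queries(slot, len(most_recent_index))
    a1 = server_answer(q1, most_recent_index, db)  # computed by server 1
    a2 = server_answer(q2, most_recent_index, db)  # computed by server 2
    return xor_bytes(a1, a2)                       # combined by the client


print(fetch(0))  # the record the index lists as most recent
```

The client asks for a slot in the index (“the most recent record”) without knowing where that record sits in the database, and the query vector has the length of the index rather than of the full database, as long as the two servers do not collude.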
November 23, 2017
November 30, 2017
Dr. Bart Knijnenburg – Assistant Professor in Human-Centered Computing, Clemson University
User-Tailored Privacy Decision Support
Privacy issues are an undying obstacle to the real-world implementation of personalized information systems. While there exist several technical privacy-preserving solutions (e.g. client-side personalization, homomorphic encryption, k-anonymity), the concept of privacy is an inherently human attitude associated with the collection, distribution and use of disclosed data, and this disclosure itself is also a human behavior.
This talk discusses one particular human-centric solution to reduce users’ privacy concerns: User-Tailored Privacy. User-Tailored Privacy is an approach to privacy that measures users’ privacy-related characteristics and behaviors, uses this as input to model their privacy preferences, and then provides them with adaptive privacy decision support. In effect, it applies data science as a means to support users’ privacy decisions.
December 8, 2017
Dr. Karin Leiderman – Assistant Professor, Department of Mathematics, Colorado School of Mines
Toward a Computational Model of Hemostasis
Hemostasis is the process by which a blood clot forms to prevent bleeding at a site of injury. The formation time, size and structure of a clot depend on the local hemodynamics and the nature of the injury. Our group has previously developed computational models to study intravascular clot formation, a process confined to the interior of a single vessel. Here we present the first stage of an experimentally validated computational model of extravascular clot formation (hemostasis) in which blood flowing through a single vessel initially escapes through a hole in the vessel wall and out a separate injury channel. This stage of the model consists of a system of partial differential equations that describe platelet aggregation and hemodynamics, solved via the finite element method. We also present results from the analogous in vitro microfluidic model. In both models, formation of a blood clot occludes the injury channel and stops flow from escaping while blood in the main vessel retains its fluidity. We discuss the different biochemical and hemodynamic effects on clot formation using distinct geometries representing intra- and extravascular injuries.