The Value of Everything: Making and Taking in the Global Economy

Modern economies reward activities that extract value rather than create it. This must change to ensure a capitalism that works for us all.

In this scathing indictment of our current global financial system, The Value of Everything rigorously scrutinizes the way in which economic value has been determined and reveals how the difference between value creation and value extraction has become increasingly blurry. Mariana Mazzucato argues that this blurriness allowed certain actors in the economy to portray themselves as value creators, while in reality they were just moving existing value around or, even worse, destroying it.

The book uses case studies – from Silicon Valley to the financial sector to big pharma – to show how the foggy notions of value create confusion between rents and profits, a difference that distorts the measurements of growth and GDP.

The lesson here is urgent and sobering: to rescue our economy from the next, inevitable crisis and to foster long-term economic growth, we will need to rethink capitalism, rethink the role of public policy and the importance of the public sector, and redefine how we measure value in our society.

Peter Thiel’s New Man In The Defense Department

The new head of defense research and engineering comes from the White House with a relatively light resume.

The Pentagon’s new 33-year-old head of research and engineering lacks a basic science degree but brings deep connections to Donald Trump and controversial Silicon Valley venture capitalist Peter Thiel.

Defense officials announced Monday that Michael Kratsios, the White House’s chief technology officer, would serve as acting undersecretary for research and engineering, a post that oversees top-priority projects in hypersonics, quantum computing, microelectronics, and other fields. He will continue to serve in his White House role.

“In seeking to fill this position we wanted someone with experience in identifying and developing new technologies and working closely with a wide range of industry partners,” said Defense Secretary Mark Esper in a statement on Monday. “We think Michael is the right person for this job and we are excited to have him on the team.”

Kratsios came to the White House in 2017 as deputy CTO, and moved up to CTO last year. He led efforts to further White House investment in artificial intelligence and quantum science and to expand U.S. partnerships in those areas. As the COVID-19 pandemic took hold, he helped launch a project to apply U.S. supercomputers to the U.S. response.

But Kratsios was a “weird pick” for these senior technical roles, according to one person who has served as both a senior White House and Defense Department official advising on technology issues. 

Kratsios graduated from Princeton with a bachelor’s degree in political science and a focus on ancient Greek democracy. The person he’s replacing, Michael Griffin, holds a Ph.D. in aerospace engineering and served as a NASA administrator. Indeed, Kratsios will be less academically credentialed than most of the program managers he oversees. So how did he get here?

After Princeton, he went to work for Peter Thiel, soon becoming CFO of Clarium Capital Management, Thiel’s investment company. He then became “chief of staff” for the tech billionaire, who was an early backer of the Trump campaign and who has played a key role in the administration’s approach to technology.

Thiel-backed ventures like Anduril and Palantir are playing a growing role in the Defense Department. The former official said the overlap between Thiel-backed defense contractors and his protege Kratsios need not be a cause for concern. The Department has spent years trying to improve its relationship with the private tech world from which Kratsios emerged. But the official said Kratsios might not prove to be the most effective ambassador.

“It’s not clear to me that Kratsios is warming up Silicon Valley,” the former official said. “I don’t know how the rest of Silicon Valley thinks of Kratsios.” 

Thiel has made a variety of enemies in the tech world and beyond; for example, he has slammed Google as being too accommodating to China. 

The development, however, is good news for “the Peter Thiel portion of Silicon Valley,” the former official said.

Tempering Expectations for GPT-3 and OpenAI’s API

On May 29th, OpenAI released a paper on GPT-3, their next iteration of Transformers-based text generation neural networks. Most notably, the new model has 175 billion parameters compared to the 1.5 billion of the previous GPT-2 iteration: a 117x increase in model size! Because GPT-3 is so large, it can’t be run on conventional computers, and it only became publicly available as part of the OpenAI API, which entered an invite-only beta soon after the paper was released and will become a paid product sometime later.

The API allows you to programmatically provide GPT-3 with a prompt and get back the resulting AI-generated text. For example, you could invoke the API with:

curl https://api.openai.com/v1/engines/davinci/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <SECRET_KEY>" \
-d '{"prompt": "This is a test", "max_tokens": 5}'

And get this back from the API, where the text field contains the generated text that continues the prompt:

{
    "id": "cmpl-<ID>",
    "object": "text_completion",
    "created": 1586839808,
    "model": "davinci:2020-05-03",
    "choices": [{
        "text": " of reading speed. You",
        "index": 0,
        "logprobs": null,
        "finish_reason": "length"
    }]
}

As someone who has spent a very large amount of time working with GPT-2 while developing tools such as gpt-2-simple and aitextgen, which allow for optimized text generation using GPT-2, I was eager to test for myself if the quality of text generated from GPT-3 was really that much better. Thanks to OpenAI, I got invited to the beta, and with permission, I released a GitHub repository with a Python script to query the API, along with many examples of text prompts and their outputs. A fun use case for GPT-3 is absurdism, such as prompting the model about unicorns speaking English, with the model prompt bolded:

I also fed my own tweets through GPT-3 and curated the output, resulting in data science one-liners that are wholly original:

There hadn’t been too much GPT-3 hype after the initial announcement, outside of a few blogs from Gwern and Kevin Lacker. Until a viral tweet by Sharif Shameem showed what GPT-3 can really do:

Later, he made a follow-up tweet generating React code with GPT-3:

That demo got the attention of venture capitalists. And when a cool-looking magical thing gets the attention of venture capitalists, discourse tends to spiral out of control. Now, there are many tweets about GPT-3, and what it can do from others who have gained access to the API.

Hype aside, let’s look at the pragmatic realities of the model. GPT-3 is indeed a large step forward for AI text-generation, but there are very many caveats with the popular demos and use cases that must be addressed.

An Overview of GPT-3

GPT-3 itself, like most neural network models, is a black box where it’s impossible to see why it makes its decisions, so let’s think about GPT-3 in terms of inputs and outputs.

Actually, why not let GPT-3 tell its own story? Hey GPT-3, how do you work?

Close, but not quite!

In layman’s terms, text-generating models such as GPT-3 generate text by taking supplied chunks of text from a prompt and predicting the next chunk of text, with an optional temperature parameter that allows the model to make suboptimal predictions and therefore be more “creative”. The model then makes another prediction from the previous chunks, including the new one, and repeats until it hits a specified length or a token that tells it to stop generating. It’s not very philosophical, or evidence of some sort of anthropomorphic consciousness.
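
To make that loop concrete, here is a minimal, generic sketch of temperature-based sampling over a model’s next-token scores, written in C++ purely for illustration; the names and structure are mine and do not reflect OpenAI’s actual implementation:

#include <algorithm>
#include <cmath>
#include <random>
#include <vector>

// Pick the next token given the model's scores (logits) for every candidate token.
int SampleNextToken(const std::vector<float>& logits, float temperature, std::mt19937& rng)
{
    // A temperature of 0 degenerates to greedy decoding: always take the top-scoring token.
    if (temperature <= 0.0f)
        return static_cast<int>(std::max_element(logits.begin(), logits.end()) - logits.begin());

    // Softmax over temperature-scaled logits; subtracting the max keeps exp() numerically stable.
    const float maxLogit = *std::max_element(logits.begin(), logits.end());
    std::vector<double> weights(logits.size());
    for (size_t i = 0; i < logits.size(); ++i)
        weights[i] = std::exp((logits[i] - maxLogit) / temperature);

    // Higher temperatures flatten the distribution, making "suboptimal" (more creative) picks likelier.
    std::discrete_distribution<int> dist(weights.begin(), weights.end());
    return dist(rng);
}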

GPT-3 has two notable improvements over GPT-2 aside from its size: it allows generation of text twice the length (about 10 paragraphs of English text total), and its prompts better steer the generated text toward the desired domain (due to few-shot learning). For example, if you prompt the model with an example of React code and then tell it to generate more React code, you’ll get much better results than if you gave it the simple prompt alone.

Therefore, there are two high-level use cases for GPT-3: the creative use case, for fun text generation at high temperature as GPT-2 was once used, and the functional use case, for specific NLP-based tasks such as webpage mockups, with a temperature of 0.0.

GPT-3 was trained on a massive amount of text from all over the internet as of October 2019 (e.g. it does not know about COVID-19), and therefore it has likely seen every type of text possible, from code, to movie scripts, to tweets. A common misconception among viewers of GPT-3 demos is that the model is trained on a new dataset; that’s not currently the case, it’s just that good at extrapolation. As an example, despite the Star Wars: Episode III – Revenge of the Sith prompt containing text from a single scene, the 0.7 temperature generation imputes characters and lines of dialogue from much further into the movie. (The largest GPT-2 model could do that, but nowhere near as robustly.)

The real metagame with GPT-3 is engineering and optimizing complex prompts which can reliably coerce outputs into what you want. And that brings with it a whole host of complexity and concerns.

GPT-3 Caveats

Despite everything above, I don’t believe that GPT-3 is a new paradigm or an advanced technology indistinguishable from magic. The GPT-3 and OpenAI API showcases on social media don’t show the potential pitfalls of the model and the API.

Hey GPT-3, what problems do you have?

Sorry GPT-3, but I am a mean person.

Model Latency

If you’ve seen the demo videos, you know the model is slow: it can take a while for output to show up, and in the meantime the user is unsure whether the model is broken or not. (There is a feature that allows streaming the model outputs as they are generated, which helps in creative cases but not in functional ones.)

I don’t blame OpenAI for the slowness. A 175 billion parameter model is wayyy too big to fit on a GPU for deployment. No one knows how GPT-3 is actually deployed on OpenAI’s servers, or how well it can scale.

But the fact remains: if the model is too slow on the user end, it results in a bad user experience and might drive people away from GPT-3 to just do things themselves (e.g. Apple’s Siri for iOS, where requests can take forever over a weak internet connection and you just give up and do it yourself).

Selection Bias Toward Good Examples

The demos for GPT-3 are creative and human-like, but like all text generation demos, they unintentionally imply that all AI-generated output will be that good. Unfortunately, that’s not the case in reality; AI-generated text has a tendency to fall into an uncanny valley, and good examples in showcases are often cherry-picked.

That said, from my experiments, GPT-3 is far better in terms of the average quality of generated text than other text-generation models, although it still does depend on the generation domain. When I was curating my generated tweets, I estimated 30-40% of the tweets were usable comedically, a massive improvement over the 5-10% usability from my GPT-2 tweet generation.

However, a 30-40% success rate implies a 60-70% failure rate, which is patently unsuitable for a production application. If it takes seconds to generate a React component and it takes on average 3 tries to get something usable, it might be more pragmatic to just create the component the hard, boring way. Compare again to Apple’s Siri, which can get very frustrating when it performs the wrong action.

Everyone Has The Same Model

The core GPT-3 model from the OpenAI API is the 175B parameter davinci model. The GPT-3 demos on social media often hide the prompt, allowing for some mystique. However, because everyone has the same model and you can’t build your own GPT-3 model, there’s no competitive advantage. GPT-3 seed prompts can be reverse-engineered, which may become a rude awakening for entrepreneurs and the venture capitalists who fund them.

Corporate machine learning models are often distinguished from those of other companies in the same field through training on private, proprietary data and bespoke model optimization for a given use case. However, OpenAI CTO Greg Brockman hinted that the API will be adding a fine-tuning feature later in July, which could help solve this problem.

Racist and Sexist Outputs

The Web UI for the OpenAI API has a noteworthy warning:

Please use your judgement and discretion before posting API outputs on social media. You are interacting with the raw model, which means we do not filter out biased or negative responses. With great power comes great responsibility.

This is a reference to the FAQ for the API:

Mitigating negative effects such as harmful bias is a hard, industry-wide issue that is extremely important. Ultimately, our API models do exhibit biases (as shown in the GPT-3 paper) that will appear on occasion in generated text. Our API models could also cause harm in ways that we haven’t thought of yet.

After the launch of the API, NVIDIA researcher Anima Anandkumar made a highly-debated tweet:

During my GPT-3 experiments, I found that generating tweets from @dril (admittedly an edgy Twitter user) resulted in 4chan-level racism/sexism that I spent enormous amounts of time sanitizing, and it became more apparent at higher temperatures. It’s especially important to avoid publishing offensive generated content that puts words in someone else’s mouth.

Jerome Pesenti, the head of AI at Facebook, also managed to trigger anti-Semitic tweets from a GPT-3 app:

Again, it depends on the domain. Would GPT-3 output racist or sexist React components? Likely not, but it’s something that would still need to be robustly checked. OpenAI does appear to take these concerns seriously, and has implemented toxicity detectors for generated content in the Web UI, although not the programmatic API yet.

Further Questions about the OpenAI API

AI model-as-a-service is an industry that tends to be a black box wrapped around another black box. Despite all the caveats, everything depends on how OpenAI exits the beta and rolls out the API for production use. There are too many unknowns to even think about making money off of the OpenAI API, let alone building a startup based on it.

At minimum, anyone using the OpenAI API professionally needs to know:

  • Cost for generation per token/request
  • Rate limits and max number of concurrent requests
  • Average and peak latencies for generating tokens
  • SLA for the API
  • AI generated content ownership/copyright

That’s certainly less magical!

The most important question mark there is cost: given the model size, I’m not expecting it to be cheap, and it’s entirely possible that the unit economics make most GPT-3-based startups infeasible.

That said, it’s still good for people to experiment with GPT-3 and the OpenAI API in order to show what the model is truly capable of. It won’t replace software engineering jobs anytime soon, or become Skynet, or whatever. But it’s objectively a step forward in the field of AI text-generation.

What about GPT-2? Since it’s unlikely that the other GPT-3 models will be open-sourced by OpenAI, GPT-2 isn’t obsolete, and there will still be demand for a more open text-generating model. However, I confess that the success of GPT-3 has left me less motivated to continue working on my own GPT-2 projects, especially since they will now be impossible to market competitively (GPT-2 is a number less than GPT-3, after all).

All said, I’d be glad to use GPT-3 and the OpenAI API for both personal and professional projects once it’s out of beta, given that the terms of use for the API are reasonable, and if the hype levels off enough that such projects can actually stand out.

If you liked this blog post, I have set up a Patreon to fund my machine learning/deep learning/software/hardware needs for my future crazy yet cool projects, and any monetary contributions to the Patreon are appreciated and will be put to good creative use.

One Thing You Can’t Do in IKEA

For many adults, IKEA is like a playground. The furniture retailer sells thousands of items designed to be assembled at home by the customer, and each item — from beds and nightstands to bookshelves and desks — follows the same, clean, modern design. For shoppers, going up and down its aisles can be an adventure in its own right, as one’s imagination can take over with all the things you can build. Many stores have restaurants and even day cares, allowing adults to spend the better part of a day browsing (and ultimately, spending). Oh, and the build-it-yourself model also makes their products a little less expensive than the alternative.

As a result, the company has seen significant success over the last two decades. Today, IKEA is enormous — it employs more than 200,000 people, has more than 400 stores in more than fifty countries, and did more than $40 billion in revenue in 2019. And so are its stores. Per some estimates, an average IKEA runs 300,000 square feet (about 27,000 square meters), or about the size of five football fields or four soccer pitches. And with all that space, and all those aisles, and all those grown-ups looking to have a fun afternoon anyway, IKEA seems like a great place to play hide-and-seek, right?

Right.

But IKEA really, really doesn’t want you to.

The problems started in the summer of 2014. According to Bloomberg, “a spirited round of the children’s game attracted hundreds of people to a Belgian Ikea outlet” that year, and at first, IKEA was not only okay with it, but they helped out. As Fast Company explains, “a Belgian blogger named Elise De Rijck coordinated a hide-and-seek meet-up at her local Wilrijk store to celebrate her 30th birthday. She created a Facebook group and invited her friends—but soon, thousands of people had joined the group. Ikea Belgium got wind of the plan and instead of squashing it, offered Ikea’s full support, including extra staff and security to host the event.” Participants hid virtually everywhere; per Bloomberg, “people were hiding in fridges, under stuffed toys, under Ikea’s blue shopping bags and even in the storage space under beds.”

But while IKEA saw this as a silly one-time event, other hide-and-seek fans wanted their chance. Other fans of the store began organizing massive games via Facebook, with numbers rivaling attendance at major sporting events. As Time reported, in early 2015, “32,000 people signed up for a Facebook event in Eindhoven; 19,000 in Amsterdam; and 12,000 in Utrecht.”  News of the planned events spread widely, as evidenced by the number of RSVPs, and IKEA execs took notice — and action. Working with Facebook, IKEA had the events removed, effectively ending the dreams of the thousands of hiders and seekers.

Today, IKEA’s policy is clear — you can’t play hide-and-seek in its stores. While IKEA management probably won’t arrest your nine-year-old for ducking under some bed covers, they’ll definitely call the police if they get wind of a massive game being organized. In 2019, hiders and seekers tried to start a 3,000-person game in Glasgow, but ended up with five seekers they weren’t counting on — the police. Per the New York Post, “five cops stayed at the shop for the entire day on Aug. 31 to gauge whether folks were browsing for a cheap desk — or actually hunting the perfect spot to hide.” Per the Post, no one was arrested.

 
Bonus fact: IKEA’s product names are in Swedish, and sometimes, that goes wrong. In 2004, the retailer introduced a new workbench for children, but you won’t find the product in their catalog any longer. The desk was named for the Swedish phrase for “speedy” or “full speed,” which, unfortunately, became “FartFull.”

From the Archives: Extreme Tag: Another playground game, but played by adults.

Microsoft Analyzed Data on Its Newly Remote Workforce


Teams that don’t communicate. Market disruption. Unidentified logjams. Employee burnout. Lost efficiency. As part of a group of data scientists, management consultants, and engineers at Microsoft, we help companies harness behavioral data to measure and solve these kinds of challenges — the kinds that firms feel but usually cannot see.

Four months ago we realized that our company, like so many others, was undergoing an immediate and unplanned shift to remote work. We all scrambled to set up home offices, situate newly homeschooled kids, juggle customer calls and cat antics, and, in many ways, rethink how to do our jobs.

At the same time, we knew this was a rare, real-time opportunity to learn something about work itself. We wanted to study how flexible and adaptable it might or might not be, how collaboration and networks morph in remote settings, what agility looks like in different spaces. Maybe most important, we wanted to know how to nurture and improve employee well-being during times of crisis.

So, we launched an experiment to measure how the work patterns across our group were changing, using Workplace Analytics, which measures everyday work in Microsoft 365, and anonymous sentiment surveys. We didn’t know what we’d find, but we felt certain that it would help us, our partners, our customers, and other organizations navigate the phases of this shift.

Our research started from a place of deep empathy for our colleagues and great curiosity about their capacity to adapt. We had few hypotheses but many burning questions, such as:

  • How will employees integrate — and separate — work and home life under the same roof?
  • Will we be able to maintain our relationships and networks without our typical face-to-face connections?
  • Will we collaborate differently in order to get our work done?
  • How will managers support and engage fully remote teams?

Because behaviors in a given scenario are responses to a host of factors, there’s no way to predict the impact an unexpected disruption or crisis will have on how people work. This is the value of measuring it in real time: You can truly see how employees — and, as an extension, a company’s culture — react and adapt. The results might come as a surprise, be counterintuitive, or reveal problems to address and positive trends to replicate. We experienced all three.

Uncovering What Has Changed About Work

For this research, we measured collaboration patterns across our 350-person Modern Workplace Transformation team, based largely in the U.S., as well as other groups within Microsoft. We looked weekly at areas such as work-life balance and collaboration by analyzing aggregated, de-identified email, calendar, and IM metadata; comparing it with metadata from a prior time period; and inviting colleagues to share their thoughts and feelings. Often, we were able to find context for the data within the lived experiences of our team. For instance, our research revealed that workdays were lengthening — people were “on” four more hours a week, on average. Our survey shed light on one possible explanation: Employees said they were carving out pockets of personal time to care for children, grab some fresh air or exercise, and walk the dog. To accommodate these breaks, people were likely signing into work earlier and signing off later.

Our findings aren’t all necessarily good or bad; many have given us a nuanced view into how people are adapting to new demands. For example, while Microsoft salespeople have significantly increased their collaboration time with customers, people in our manufacturing group have focused on streamlining and optimizing connection points with a growing number of supplier contacts. Some revelations turned out to be clear bright spots, like the fact that multitasking during meetings didn’t spike even though people weren’t in the same room. Other insights, like signs of blurring work-life boundaries, are indicators we want to learn more about going forward. And we’re still digging into the short- and long-term impact of some of the changes — such as the repurposing of commute time for meetings — on our employees, teams, and organization.

Our most fascinating findings can be grouped into a handful of big themes:

When driven by employees, entrenched norms can change quickly. Measuring collaboration patterns across our 350-person team, we looked at how meetings had changed beyond remote-only attendance. One data point stunned us: the rise of the 30-minute meeting (cue Star Wars music).

While weekly meeting time increased by 10% overall — we could no longer catch up in hallways or by the coffee machine, so we were scheduling more connections — individual meetings actually shrank in duration. We had 22% more meetings of 30 minutes or less and 11% fewer meetings of more than one hour.

This was surprising. In recent decades meetings have generally gotten longer, and research shows it has had a negative effect on employee productivity and happiness. Our flip to shorter meetings had come about organically, not from any management mandate. And according to our sentiment survey, the change was appreciated. Suddenly the specter of an hour-long meeting seemed to demand more scrutiny. (Does it really need to be that long? Is this a wise use of everyone’s time?) This is one of the many ways that the remote-work period could have a long-term impact.

Managers get soaked, but they also carry the life preservers. Through multiple indicators, we learned that managers are bearing the brunt of the shift to remote work. Senior managers are collaborating eight-plus more hours per week. In China, where offices closed weeks earlier than in other countries, and where we measured impacts beyond our 350-person team, our manager colleagues saw Microsoft Teams calls double, from seven hours a week to 14 hours a week. Working to support employees, nurture connections, and manage dispersed teams from home, managers sent 115% more IMs in March, compared with 50% more for individual contributors.

At the same time, managers are enabling employee resilience through this disruption. Employees across our team saw their work hours spike after the shift. But looking at one-on-one meetings, a key connection point in the manager-employee relationship, we found that the employees who averaged the most weekly one-on-one time with their managers experienced the smallest increase in working hours. In short, managers were buffering employees against the negative aspects of the change by helping them prioritize and protect their time.

This data supports consistent themes we heard from our people: “My manager increased the frequency of one-on-ones. They have been a great way to stay aligned and, especially at the beginning, navigate the shift effectively,” one employee reported. “The challenges of this time helped me understand the need to get to know my employees better and focus my efforts on their goals,” said one manager.

Chart: Manager One-on-Ones Can Help Contain Longer WFH Hours. Microsoft employees whose managers checked in with them more often during the early stages of the Covid-19 crisis saw a smaller increase in both collaboration hours and general working hours. Workers who received 30 minutes of weekly one-on-one time collaborated around 1.75 more hours and worked around 1.5 more hours per week, on average; workers who received only 15 minutes of weekly one-on-one time collaborated around 3.75 more hours and worked around 3 more hours per week, on average. Source: Microsoft Workplace Analytics

It doesn’t take much for workplace culture to start to shift. The data we looked at has allowed us to quantify how the rhythms of the workday have changed across our team. For example, just a few months ago many of us could not have imagined spending our commute time anywhere but in our commute. We were used to meetings concentrated in the mornings, breaks at lunchtime, focused work in the afternoons, and a transition back to our personal lives at the end of each day.

When disruption came calling, we found that flexibility was close behind. Most of our team shifted meetings away from the 8 AM to 11 AM window and toward the 3 PM to 6 PM window. As our days became fragmented (“like Swiss cheese,” one employee put it), with more meetings and personal responsibilities to juggle, we leaned on flexibility. Working in pockets helped, but sometimes we found that job demands rushed in to fill spaces previously reserved for personal downtime:

  • Before the crisis, we typically saw a 25% reduction in instant messaging during the lunch hour, but now that reduction is down to 10%.
  • A new “night shift” has taken root, which employees are using to catch up on work — and not only focused individual work. The share of IMs sent between 6 PM and midnight has increased by 52%.
  • Employees who had well-protected weekends suddenly have blurrier work-life boundaries. The 10% of employees who previously had the least weekend collaboration — less than 10 minutes — saw that amount triple within a month.

Chart: Lunchtime and Evenings Aren’t a Break from Screens. Before the Covid-19 crisis, there was typically a 25% reduction in instant messaging among Microsoft employees during the lunch hour; once working from home became the norm, the reduction dropped to only 10%. Source: Microsoft Workplace Analytics

Some of the changes we measured might seem inconsequential on their own. But taken together, they reveal a shift in our work culture that was neither intended nor wanted. We will continue to closely monitor these trends.

Human connection matters a lot, and people find a way to get it. We know that belonging is a core human need and that feeling a sense of connection is an intrinsic motivator. This is why work relationships are so important — strong social connections help employees feel happier and healthier and build stronger networks.

On our team, around Microsoft, and across many of our customers’ companies, a trend cropped up very quickly after the shift to remote work: virtual social meetings. Responding to the lack of natural touchpoints — grabbing lunch in the cafeteria, popping by someone’s desk — employees found new ones. In our group, these ranged from group lunches to happy hours with themes such as “pajama day” and “meet my pet.” Overall, social meetings went up 10% in a month.

At the same time, scheduled one-on-ones among employees went up 18%, showing that people would sooner add meetings to their schedules than lose connections.

We also measured networks across more than 90,000 Microsoft employees in the United States. Frankly, we expected to see them shrink significantly, given the rapid shifts in environment, daytime rhythms, and personal responsibilities. Instead, we discovered that most employees maintained their existing connections. Even more encouraging, most people’s network size increased. We had assumed that in a time of crisis, employees might strengthen networks within their own work groups in an insular way. In fact, we saw network growth not only within existing work groups but also across different groups, indicating that to adapt and thrive teams sought to build bridges.

Understanding the shifts in people’s behaviors and in business as usual was only the first step. The next one — trickier and equally critical — is to figure out which changes we should actively address and course-correct, even as the ground beneath our organization continues to shift. We’ve heard many of our customers express a desire to focus energy on building innovative, resilient frameworks for the future. Fortunately, we know from research on the “fresh start effect” that now is a perfect time to carefully and deliberately reshape our work culture.

In our experience, organizations and leaders who successfully seed change are those who choose to tackle a small number of challenges, maybe even just one, rather than opening up their whole culture to be reimagined. The challenges they choose tend to be the problematic norms that pose the most risk to employee well-being, business continuity, and customer focus. Within Microsoft and among our customers and partners, we’ve seen groups respond to recent behavioral shifts by normalizing manager one-on-ones to help employees gain clarity and connection, increasing small-group meetings to combat the isolation of remote work, and reducing late-night instant messaging to address burnout. One of our customers is using our data to understand which teams are navigating remote work really well, as part of planning for a possible two-year work-from-home scenario.

Our organization and others within Microsoft are also trying creative tactics to support engagement and productivity and to better integrate work and life. One product engineering team has committed to “Recharge Fridays” — days free of meetings so that employees can focus. As an antidote to the “always on” triage mode of remote work, some teams have been intentional about encouraging employees to use their vacation time to unplug and relax. The thread connecting the most successful of these interventions is that they focus on mindset rather than outcomes. In other words, we asked why people aren’t able to focus, recognized that it’s because free space is too often filled by meetings, and then collectively decided to eliminate meetings on certain days altogether.

As our company and many others plan for what comes next, we’re adjusting the focus of our research to the changes that will be needed to continue supporting organizational health and business continuity. These changes include new processes and policies, tooling and workspaces, collaboration norms, and employee wellness resources. We know the future will be increasingly digital, flexible, and remote-friendly, or even remote-first. And as organizations across the globe shift back to the office, measuring patterns of work against a baseline and keeping an eye on how people adapt will be essential — especially if new waves of disruption bring new unknowns. For example, our colleagues in China, who have already moved large parts of their workforces back to the office, are seeing that some of the habits that emerged during remote work, such as more reliance on instant messaging and longer workweeks, have continued even after the return.

Is work today permanently different from what it was before Covid-19 and the work-from-home shift? We don’t know yet, but the data can give us ongoing, real-time information that we can use to influence what happens next. We believe that what we learn about these changes will be key to organizational resiliency in the months and years to come.

The authors would like to thank Abhinav Singh, Microsoft Workplace Intelligence associate, for contributing to this report.

About the authors: Natalie Singer-Velush is a marketing communications manager and the editor of Microsoft Workplace Insights. She creates thought leadership about people analytics, behavioral science, the future of work, and the power of data to help organizations and people innovate, evolve, and succeed. As a former journalist and an MFA in a sea of MBAs, she thrives on bringing data to life through storytelling. Kevin Sherman is a director on the Workplace Analytics team at Microsoft. He leads a team of experts who apply storytelling, behavioral science, and corporate strategy to product strategy and customer delivery. He’s also been leading Microsoft’s own use of Workplace Analytics since its acquisition of VoloMetrix in 2015. Erik Anderson is a director on Microsoft’s workplace intelligence team. He has spent his career building teams and methods for data-driven problem-solving and transformation, and he currently leads a team that helps customers harness the power of their collaboration data.

Fixing Mass Effect black blobs on modern AMD CPUs

TL;DR – if you are not interested in an in-depth overview of what was wrong with the game and how it was fixed,
scroll down to the Download section for a download link.



Mass Effect is a popular franchise of sci-fi roleplaying games. The first game was initially released by BioWare in late 2007 on Xbox 360 exclusively as a part of a publishing deal with Microsoft.
A few months later in mid-2008, the game received a PC port developed by Demiurge Studios. It was a decent port with no obvious flaws, that is, until 2011, when AMD released their new Bulldozer-based CPUs.
When playing the game on PCs with modern AMD processors, two areas in the game (Noveria and Ilos) show severe graphical artifacts:


Well, that doesn’t look nice.

While not unplayable, it’s definitely distracting. Thankfully, workarounds exist – such as
disabling lighting via console commands
or modifying the game’s maps to remove broken lights, but seemingly the issue has never been fully understood.
Some sources claim that an FPS Counter mod can also fix that issue, but I couldn’t find much information about it and the mod’s sources don’t seem to be available online,
and there is no documentation on how the mod tackles this error.

What makes this issue particularly interesting? Vendor-specific bugs are nothing new, and games have had them for decades. However, to the best of my knowledge, this is the only case where a graphical
issue is caused by a processor and not by a graphics card. In the majority of cases, issues happen with a specific GPU vendor and don’t depend on the CPU, while in this case, it’s the exact opposite.
This makes the issue very unique and worth looking into.

Looking up existing discussions online, this issue seems to affect AMD FX and Ryzen chips. Compared to the older AMD chips, these lack the 3DNow! instruction set.
Unrelated or not, the community consensus was that this was the cause of the bug and that the game tried to use those instructions upon detecting an AMD CPU.
Given that there are no known cases of this bug occurring on Intel CPUs and that 3DNow! instructions were exclusive to AMD, it’s no surprise the community assumed that this is the issue.

Is this really the issue, or is it caused by something entirely different? Let’s find out!

Prelude

Even though the issue is trivial to reproduce, I couldn’t look into it for the longest time for a simple reason – I don’t have access to any PCs with AMD hardware!
Thankfully, this time I’m not approaching research alone – Rafael Rivera got my back during the entire process of R&D,
providing a test environment with an AMD chip, insights, ideas as well as putting up with hundreds of blind guesses I usually throw around when trying to find the way to the root of such unknown problems.

Since we now had a good testing environment, the first theory to test was of course cpuid – if people are right in assuming that 3DNow! instructions are to blame, there should be a place in the game’s code
where they check for their presence, or at the very least check for the CPU vendor. That reasoning is flawed, though; if it was true that the game attempts to use 3DNow! instructions any time it runs on an AMD chip,
without checking if they are supported, the game would most likely crash when trying to execute an illegal instruction. Moreover, a quick scan around the game’s code reveals that the game doesn’t
check for CPU capabilities. Therefore, whatever is up with this issue, it doesn’t appear to be caused by the game mis-detecting CPU features, because it seemingly doesn’t care about them in the first place.

When this started looking like an undebuggable case, Rafael came back to me with a realization – disabling PSGP (Processor Specific Graphics Pipeline) fixes the issue and the characters are properly lit!
PSGP is not the best documented term, but in short, it’s a legacy (concerning only older DirectX versions) feature allowing Direct3D to perform processor-specific optimizations:

In previous versions of DirectX, there was a path that allowed to do vertex processing called the PSGP. Applications had to take this path into account and support a path for vertex processing
on the processor and graphics cores.

Putting it this way, it makes sense why disabling PSGP fixes artifacts on AMD – the path taken by modern AMD processors may be somehow broken.
How to disable it? Two ways come to mind:

  • It is possible to pass a D3DCREATE_DISABLE_PSGP_THREADING flag to IDirect3D9::CreateDevice. It’s defined as:

    Restrict computation to the main application thread. If the flag is not set, the runtime may perform software vertex processing and other computations in worker thread
    to improve performance on multi-processor systems.

    Sadly, setting that flag doesn’t fix the issue. Looks like, despite the flag having “PSGP” in name, it’s not what we are looking for.

  • DirectX specifies two registry entries to disable PSGP in D3D and to disable PSGP only for D3DX – DisablePSGP and DisableD3DXPSGP. Those flags can be set system-wide or process-wide.
    For information on how to set them only for a specific process, see Rafael Rivera’s guide on enabling application-specific Direct3D flags.

DisableD3DXPSGP appears to be a viable fix for that issue. Therefore, if you have an aversion towards downloading third party fixes/modifications or you must fix this issue without making
any changes to the game, it’s a perfectly fine way of doing it. As long as you set that flag only for Mass Effect and not system-wide, it’s fine!

PIX

As always with graphical issues, PIX is likely the most useful tool one could use to diagnose them. We captured similar scenes from Intel and AMD hardware and compared the results.
One difference was instantly noticeable – unlike with my past projects, where captures did not carry the bug with them and the same capture
would look different on different PCs (indicating a driver or d3d9.dll bug), these captures carry the bug with them! In other words, a capture from AMD hardware opened on a PC with Intel hardware
does show the bug.

An AMD capture on Intel looks no different than on the hardware it was taken from:

What does this tell us?

  • Since PIX does not “take screenshots” but instead captures the sequence of D3D commands and executes them on hardware, we can observe that executing the commands captured from an AMD box
    results in the same bug when executed on Intel.
  • This strongly implies that the difference is not caused by the difference in how the commands are executed (that’s how you get GPU specific bugs), but what commands are executed.

In other words, it’s almost certainly not any sort of a driver bug. Instead, the way inputs for the GPU are prepared seems to be somehow broken. That is indeed a very rare occurrence!

At this point, finding the bug is a matter of finding any jarring differences between captures. It’s tedious, but that’s the only viable way.

After a long while spent poking the capture, a full body draw call caught my attention:

On an Intel capture, this draw outputs most of the character’s body, together with lighting and textures. On an AMD capture, it outputs a plain black model. This looks like a good trail.

The first obvious candidate for checking would be bound textures, but they seem to be fine and are consistent across captures.
However, some of the pixel shader constants looked weird. Not only do they have NaNs (Not a Number), but they also seem to only appear on the AMD capture and not the Intel capture:


1.#QO indicates a NaN

This looks promising – NaN values causing strange visuals are not unheard of. Funnily enough, a PlayStation 3 version of Mass Effect 2
had a very similar looking issue in RPCS3 which was also related to NaNs!

However, before we get too excited, those values could just be leftovers from previous draws and they might end up being unused for the current draw.
Luckily, in this case it’s clearly visible that those NaNs get submitted to D3D for this specific draw…

49652	IDirect3DDevice9::SetVertexShaderConstantF(230, 0x3017FC90, 4)
49653	IDirect3DDevice9::SetVertexShaderConstantF(234, 0x3017FCD0, 3)
49654	IDirect3DDevice9::SetPixelShaderConstantF(10, 0x3017F9D4, 1) // Submits constant c10
49655	IDirect3DDevice9::SetPixelShaderConstantF(11, 0x3017F9C4, 1) // Submits constant c11
49656	IDirect3DDevice9::SetRenderState(D3DRS_FILLMODE, D3DFILL_SOLID)
49657	IDirect3DDevice9::SetRenderState(D3DRS_CULLMODE, D3DCULL_CW)
49658	IDirect3DDevice9::SetRenderState(D3DRS_DEPTHBIAS, 0.000f)
49659	IDirect3DDevice9::SetRenderState(D3DRS_SLOPESCALEDEPTHBIAS, 0.000f)
49660	IDirect3DDevice9::TestCooperativeLevel()
49661	IDirect3DDevice9::SetIndices(0x296A5770)
49662	IDirect3DDevice9::DrawIndexedPrimitive(D3DPT_TRIANGLELIST, 0, 0, 2225, 0, 3484) // Draws the character model

…and the pixel shader used for this draw references both constants:

// Registers:
//
//   Name                     Reg   Size
//   ------------------------ ----- ----
//   UpperSkyColor            c10      1
//   LowerSkyColor            c11      1

Both constants appear to come straight from Unreal Engine and judging by the names,
they might directly influence the lighting. Bingo!

A quick in-game test further confirms the theory – on an Intel machine, a vector of 4 NaN values was never submitted as pixel shader constants;
meanwhile, on an AMD machine, NaNs would start showing up as soon as the player entered the area where lighting breaks!

Does it mean work is done? No, far from it, as finding broken constants is only half the battle. The question remains: where do they come from, and can they be replaced?
An in-game test replacing NaN values with zeros partially fixed the issue – ugly black blobs disappeared, but characters were still way too dark:


Almost correct… but not quite.

Given how important these light values might be for the scene, it’s not feasible to settle for a workaround like this. We know we are on the right track though!

Sadly, any attempt to track down the origin of these constants pointed towards something resembling a render thread and not the real place of submission.
While not undebuggable, it’s clear that we needed to try a fresh approach before potentially spending an infinite amount of time following the data flow between game-specific
and/or UE3-specific structures.

Taking a step back, we realized that we overlooked something earlier on. Recall that to “fix” the issue, one of two registry entries had to be added – DisablePSGP and DisableD3DXPSGP.
Assuming their naming is not misleading, then DisableD3DXPSGP should be a subset of DisablePSGP, with the former disabling PSGP in D3DX only, and the latter disabling it in both D3DX and D3D.
With this assumption, we turned our sights to D3DX.

Mass Effect imports a set of D3DX functions by linking against d3dx9_31.dll:

D3DXUVAtlasCreate
D3DXMatrixInverse
D3DXWeldVertices
D3DXSimplifyMesh
D3DXDebugMute
D3DXCleanMesh
D3DXDisassembleShader
D3DXCompileShader
D3DXAssembleShader
D3DXLoadSurfaceFromMemory
D3DXPreprocessShader
D3DXCreateMesh

Looking at the list, if I approached it without prior knowledge gained from the captures I would have expected D3DXPreprocessShader or D3DXCompileShader to be possible culprits – shaders
could be wrongly optimized and/or compiled on AMD, but fixing that could be insanely challenging.

However, with our current knowledge one function stands out from this list – D3DXMatrixInverse is the only function that could reasonably be used to prepare pixel shader constants.

The function is called from only one place in the game:

int __thiscall InvertMatrix(void *this, int a2)
{
  D3DXMatrixInverse(a2, 0, this);
  return a2;
}

It’s… not too well made, though. A quick peek inside d3dx9_31.dll reveals that D3DXMatrixInverse does not touch the output parameters and returns nullptr
if matrix inversion fails (due to the input matrix being singular), yet the game doesn’t care about this at all. Output matrix might be left uninitialized, boo!
Inverting singular matrices does indeed happen in the game (most frequently in the main menu), but no matter what we did in an attempt to make the game handle them better
(e.g. zeroing the output or setting it to an identity matrix), visuals wouldn’t change. Oh well.

With this theory debunked, we are back to PSGP – what is PSGP doing exactly in D3DX? Rafael Rivera looked into that and the logic behind it turns out to be quite simple:

AddFunctions(x86)
if(DisablePSGP || DisableD3DXPSGP) {
  // All optimizations turned off
} else {
  if(IsProcessorFeaturePresent(PF_3DNOW_INSTRUCTIONS_AVAILABLE)) {
    if((GetFeatureFlags() & MMX) && (GetFeatureFlags() & 3DNow!)) {
      AddFunctions(amd_mmx_3dnow)
      if(GetFeatureFlags() & Amd3DNowExtensions) {
        AddFunctions(amd3dnow_amdmmx)
      }
    }
    if(GetFeatureFlags() & SSE) {
      AddFunctions(amdsse)
    }
  } else if(IsProcessorFeaturePresent(PF_XMMI64_INSTRUCTIONS_AVAILABLE /* SSE2 */)) {
    AddFunctions(intelsse2)
  } else if(IsProcessorFeaturePresent(PF_XMMI_INSTRUCTIONS_AVAILABLE /* SSE */)) {
    AddFunctions(intelsse)
  }
}

Unless PSGP is disabled, D3DX picks functions optimized to make use of a specific instruction set. That makes sense and ties back to the original theory.
As it turns out, D3DX has functions optimized for AMD and 3DNow! instruction set, so the game is indirectly making use of those after all.
With 3DNow! instructions removed, modern AMD processors take the same code path as Intel processors – that is, intelsse2.

To summarize:

  • Disabling PSGP makes both Intel and AMD take a regular x86 code path.
  • Intel CPUs always take an intelsse2 code path.
  • AMD CPUs supporting 3DNow! take an amd_mmx_3dnow or amd3dnow_amdmmx code path, while CPUs without 3DNow! take an intelsse2 code path.

With this information, we put forward a hypothesis – something is possibly wrong with AMD SSE2 instructions, and the results of matrix inversion calculated on AMD with an intelsse2 path
are either too inaccurate or completely incorrect.

How do we verify that hypothesis? By tests, of course!

P.S.: You may be thinking – “well, the game uses d3dx9_31.dll but the newest D3DX9 library is d3dx9_43.dll, surely that must be fixed in later revisions??”.
We tried to verify that by “upgrading” the game to link against the newest DLL – and nothing changed.

We prepared a simple, standalone program to verify the precision of matrix inversions. During a short game session in the “bugged” game area, we recorded every input and output
of D3DXMatrixInverse to a file. Later, this file was read by a standalone test program and the results were recalculated again. To verify correctness, outputs from the game were then compared
with outputs calculated by the test program.
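
As a rough illustration, the cross-check boils down to something like the following C++ sketch, which assumes a simple binary log of (input matrix, recorded output matrix) pairs captured in-game and builds against the legacy DirectX SDK’s d3dx9.h; the structure and names are illustrative, not the actual test program:

#include <cstdio>
#include <cstring>
#include <d3dx9.h>

// Re-runs every recorded inversion on the current machine and compares the result
// bit-for-bit against what the game computed when the log was captured.
bool VerifyInversionLog(FILE* log)
{
    D3DXMATRIX input, recordedOutput, recomputedOutput;
    bool allEqual = true;

    while (fread(&input, sizeof(input), 1, log) == 1 &&
           fread(&recordedOutput, sizeof(recordedOutput), 1, log) == 1)
    {
        D3DXMatrixInverse(&recomputedOutput, nullptr, &input);

        // Deliberately no floating-point tolerance: a raw memcmp, as described above.
        if (std::memcmp(&recordedOutput, &recomputedOutput, sizeof(recordedOutput)) != 0)
            allEqual = false;
    }
    return allEqual;
}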

After several runs based on data collected from Intel and AMD chips, with PSGP enabled and disabled, we cross-checked the results between machines.
The results are as follows, ✔️ indicating success (results were equal) and ❌ indicating failure (results were not equal). The last column indicates whether the game handles this data fine
or if it glitches out. We deliberately did not take the imprecision of floating-point maths into account and instead compared the results with memcmp:

Data source | Intel SSE2 | AMD SSE2 | Intel x86 | AMD x86 | Accepted by game?
Intel SSE2  | ✔️ | ❌ | ❌ | ❌ | ✔️
AMD SSE2    | ❌ | ✔️ | ❌ | ❌ | ❌
Intel x86   | ❌ | ❌ | ✔️ | ✔️ | ✔️
AMD x86     | ❌ | ❌ | ✔️ | ✔️ | ✔️

Test results for D3DXMatrixInverse

Interesting – the results show that:

  • Calculations with SSE2 do not transfer across Intel and AMD machines.
  • Calculations without SSE2 do transfer across machines.
  • Calculations without SSE2 are “accepted” by the game despite not being identical to the ones from Intel SSE2.

This raises a question – what exactly is wrong with calculations from AMD SSE2 so they end up glitching the game?
We don’t have a precise answer for that, but it seems to be a product of two factors:

  • SSE2 implementation of D3DXMatrixInverse might be poor numerically – seems like some SSE2 instructions give different results on Intel/AMD (possibly different rounding modes),
    and the function is not written in a way which would help mitigate the inaccuracies.
  • The game’s code is written in a way that is too sensitive to accuracy issues.

At this point, we were ready to put forward a fix which would replace D3DXMatrixInverse with a rewrite of an x86 variation of the D3DX function and call it a day.
However, before proceeding I had one more random idea – D3DX is deprecated and got replaced with DirectXMath.
I figured that since we were going to replace that matrix function anyway, I could try replacing it with XMMatrixInverse, the “modern” replacement for D3DXMatrixInverse.
XMMatrixInverse also uses SSE2 instructions, so it should be about as fast as the D3DX function, but I was nearly sure it would break the same way.

I hacked it together quickly, sent it off to Rafael and…

It works fine!?

What we were sure to be an issue coming from tiny differences in SSE2 instructions may have been a purely numerical issue after all. Despite also using SSE2, XMMatrixInverse gave perfect results
on both Intel and AMD. Therefore, we re-ran the same tests and the results were surprising, to say the least:

Data source | Intel | AMD | Accepted by game?
Intel       | ✔️ | ✔️ | ✔️
AMD         | ✔️ | ✔️ | ✔️

Test results for XMMatrixInverse

Not only does the game work fine, but the results are perfectly identical and transfer across machines!

With this in mind, we revised the theory behind that bug – without a doubt, the game is at fault for being too sensitive to issues, but with additional tests, it seems like
D3DX may have been written with fast math in mind, while DirectXMath may care more about precise calculations. This makes sense – D3DX is a product of the 2000s and it is perfectly reasonable
that it was written with performance being the main priority. DirectXMath has the “luxury” of being engineered later, so it could put more attention towards precise, deterministic computations.

It took a while to get here, so I hope you’re still not bored to death. To summarize, here’s what we went through:

  • We verified that the game does not use 3DNow! instructions directly (only the system DLLs do).
  • We found out that disabling PSGP fixes the issue on AMD processors.
  • Using PIX, we found the culprit – NaN values in pixel shader constants.
  • We nailed down the origin of those values to D3DXMatrixInverse.
  • We fuzzed that function and found out that it does not give consistent results between Intel and AMD CPUs when SSE2 instructions are used.
  • We accidentally found out that XMMatrixInverse does not have this flaw and is a viable replacement.

The only thing that’s left is to implement a proper replacement! That’s where SilentPatch for Mass Effect comes in.
We have decided that the cleanest way to fix this issue is to provide a replacement d3dx9_31.dll, which forwards every function imported by Mass Effect
to the system DLL, except for D3DXMatrixInverse. For this function, we have developed a replacement using XMMatrixInverse.
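
In spirit, the replacement looks something like the sketch below, which assumes the caller passes 4x4 row-major float matrices (the D3DXMATRIX layout, bit-compatible with DirectX::XMFLOAT4X4); for the actual code, see the GitHub repository linked at the end of this post:

#include <windows.h>
#include <DirectXMath.h>

using namespace DirectX;

// Drop-in stand-in for D3DXMatrixInverse, computed with DirectXMath instead of D3DX.
// Mirrors the original contract: returns nullptr and leaves pOut untouched if the
// input matrix is singular (not invertible).
XMFLOAT4X4* WINAPI MatrixInverse_Replacement(XMFLOAT4X4* pOut, float* pDeterminant, const XMFLOAT4X4* pM)
{
    const XMMATRIX m = XMLoadFloat4x4(pM);

    XMVECTOR det;
    const XMMATRIX inverted = XMMatrixInverse(&det, m);

    // A zero determinant means the matrix cannot be inverted.
    if (XMVectorGetX(det) == 0.0f)
        return nullptr;

    XMStoreFloat4x4(pOut, inverted);
    if (pDeterminant != nullptr)
        *pDeterminant = XMVectorGetX(det);
    return pOut;
}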

A replacement DLL makes for a very clean and bulletproof installation, and it’s been confirmed to work fine with both Origin and Steam versions of the game.
It works out of the box, without the need for an ASI Loader or any other third party software.

To our best knowledge, the game now looks exactly how it should, without any downgrades in the lighting:

Noveria

Ilos

Download

The modification can be downloaded in Mods & Patches. Click here to head to the game’s page directly:

Download SilentPatch for Mass Effect
After downloading, all you need to do is to extract the archive to the game’s directory and that’s it! Not sure how to proceed? Check the Setup Instructions.


For those interested, the full source code of the mod has been published on GitHub, so it can be freely used as a point of reference:
See source on GitHub

Tired of note-taking apps

I’m tired of note-taking apps.

It’s not because of limited choices; it’s the other way around. There are so many note-taking apps you could try, but you end up sticking with none. At least, that’s my story. It’s a perfect example of the paradox of choice.

I used to wonder why people keep building so many ‘note-taking’ apps when the market is already crowded with choices. Then I figured out a few reasons why.

  • the market size: the global note-taking management software market is estimated to reach $1.35 billion by 2026, growing at a CAGR of 5.32% from 2019 to 2026
  • greater scope for innovation: e.g., be it creating a task list, a roadmap, or a design repository, Notion can handle it all
  • lack of satisfaction: people tend to use a combination of note-taking apps and hardly stick to one for long

Despite such heavy competition, apps like Notion, Google Keep, OneNote, Evernote, etc. have managed to earn a place. People use these apps for

  • the ecosystem, e.g., Google Keep, Microsoft OneNote
  • the neat user experience, e.g., Bear
  • a disciplined way of taking notes, e.g., Notion, Roam Research

I’ve tried them all. But none of these apps have turned me into a ‘repeat user.’

After battling with so many apps only to feel guilty for not having the discipline to consistently use them, I’ve finally resorted to the most personal and easy alternative: writing things down.

I’ve been writing in a notebook since childhood. It’s not new to me, and it requires absolutely no learning curve.

The reasons why I find writing things down useful

  • absolute focus and the ability to think through the points I’m writing
  • gives a chance to remember what I’m writing
  • no way to copy-paste stuff as it is, and that means taking notes in a way I understand
  • easy to switch between formats, e.g., flowchart, mind map, Venn diagram, etc.
  • helps me stay in touch with my handwriting

Of course, everything has its downsides, and writing things down is no exception here.
For example, I will not be able to

  • add screenshots/images, links, etc
  • easily search for content as there’s no ‘search bar’

And maybe there’s more to the list I’m not talking about.

All I can say for sure is, based on my usage behavior, I’m okay missing out on these features. I can always save links to Pocket for future reference, and take pictures of my notes to share with friends.

So if you ask me if I’d try a beautiful, innovative note-taking app that’s much better than the apps I’ve used so far, my answer is, “Why not! I’d definitely give it a shot.”

But my greatest worry is whether I’d continue using it.

Note: If my opinion on note-taking apps changes over time, I’d be happy to update this post with a “And the hero finally arrived!” heading to talk about the app that helped change my mind. 🤡