Have you considered the dangers of using any data-driven approach? By definition they can only quantify SAMENESS - what has already been accomplished. This is only half the story. Even if that sameness is really desirable, surely it is the UNIQUE qualities of each city which make its character and, therefore, its value amidst the cities of our planet.
Using data is appealing, and would be fine if we were not concerned with living systems. To the extent a city has an identity its inhabitants accept and encourage, I believe a city is a living system. ("Life" here is defined as the system's continuing capability to be and become itself.) Putting a city into a category with others whose DESCRIPTIONS are similar tells us about what the city produces (its mechanism) but nothing about what it is (its organism).
Cities are NOT replicable. Each one is inherently a one-off. (Life is not replicable either, which is why no scientist can ever say what life IS. Replicability is what makes scientific truth and the scientific method so valuable.) If we embrace a science-based (data driven) approach, we are embracing science's conundrum, also.
Science cannot recognise life as a valid phenomenon because it cannot study any phenomenon that is NEVER the same as itself or anything else. Cities have the same problem. Science can only study classes and categories: things that are shared, non-unique. Being alive implies a certain unboundedness, unpredictability. When something is ALIVE it has the potential to change, autonomously redefining itself. Life cannot sit neatly inside categories. It tends to spontaneously make new ones. That's one of the most interesting aspects of cities, too.
As it cannot be classified or replicated, a city (or even the creative, living people that comprise it) cannot be considered as a significant influence upon any phenomenon found in the data. The LIFE of the city, or of the people within it, is not available to a scientific approach. So if you wish to use the collected data, make sure it is not to prove anything, back an argument or test an hypothesis. Unless, of course, you are happy to throw the city's life out of the window too, which I doubt.
However, all is not lost. Historians, for instance, use data quite differently and their studies are very applicable to living, unique systems like cities. Perhaps we should look there instead for suitable models and approaches to this interesting question?