LLUCAS Sprint02
These are the sprint archives/summaries of Anthony Beck. Anthony is a part-time (60% Full Time Equivalent (FTE)) researcher on the Leverhulme Trust funded project Sustaining Urban Habitats - an interdisciplinary approach, led by Prof Darren Robinson.
These documents represent an archive of the work undertaken by Ant.
Unless stated otherwise, these documents are released under a CC0 (+BY) licence. This means that they are essentially in the public domain and you can re-use them in an unfettered way. I would, however, appreciate, but do not mandate, a reference.
Sprint 02 16th Feb - 27th Feb 2015
A general orientation sprint.
This sprint has the following sub-tasks:
- Develop short-term plan for me
- Described in /home/arb/ownCloud/Nottingham/LLUCAS/Sprints/Sprint01/ARB_ProofOfConceptProject.md
- Infrastructure - software environment, processing and repository
- Other longer term data repository issues Darren is involved in
- Look at data infrastructure - Chris Parry
- Progress - meeting Chris next week
- Invited to the Management group (Library/IT/research) to give a presentation
- Look at OS AddressBase data
- Pass over CMS details to Jemma
- Actually think this should be:
- David Aldred
- Web Manager
- 0115 84 67476
- david.aldred@nottingham.ac.uk
- Tom Wright
- Digital Engagement Manager
- 0115 95 13309
- tom.wright@nottingham.ac.uk
- Mr Andy Beggan
- Associate Director, LRLR Learning Technology
- 0115 84 67707
- andy.beggan@nottingham.ac.uk
- Slideshare
- Scribd
- etc.
- Actually think this should be:
- Develop a data requirements/needs document that will act as a template across all the projects
- Get in touch with Data Analytics people
- Didier - meeting arranged
- ADAC
- Specifically Jon Garibaldi - prof in Informatics - who does semantic reasoning and decision support :-)
- Background reading
Tasks rolled over for the next sprint
- contact David Holland and Isabel Sargent @ OSGB.
- Mention my name and ask if they can discuss what has been happening with their 3D city models research. I know Bournemouth was well covered in research (10 years ago ish) but it would be great to find out if/where this has been rolled out to. Also, if they have plans, could we ask that they consider Nottingham etc.? I have not kept up with what has been happening in Photogrammetric Services at OS (and what LiDAR they have been looking at), so it would be good to know.
- Contact CMS people (if the contacts below don't work, then Jemma) - look at SS rendering
- Actually think this should be:
- David Aldred
- Web Manager
- 0115 84 67476
- david.aldred@nottingham.ac.uk
- Tom Wright
- Digital Engagement Manager
- 0115 95 13309
- tom.wright@nottingham.ac.uk
- Mr Andy Beggan
- Associate Director, LRLR Learning Technology
- 0115 84 67707
- andy.beggan@nottingham.ac.uk
- Actually think this should be:
- Develop a data requirements/needs document that will act as a template across all the projects
- Get in touch with Data Analytics people
- Didier - meeting arranged
- ADAC
- Specifically Jon Garibaldi - prof in Informatics - who does semantic reasoning and decision support :-)
- James Goulding
Retrospective
A general review of thoughts/issues etc. that summarise the outcomes of this sprint and should be taken forward to the next sprint.
What went well?
- Approval given by Darren for MSc student project
- Good progress with the research data infrastructure team - Chris Parry and Christine Middleton
- I got bumped up on the CMS course and attended the course on 20150223
- Meeting with Didier next week
What have we learnt?
- The University infrastructure may be able to allow us to do all we want (and more)
- I am building up a good network within the university that deals with big data.
What could we do better?
What still puzzles us?
What is blocking activity?
Task summary
Infrastructure
Caveat - we could take everything outside of university infrastructure. I have made the assumption we want to do everything in collaboration with the university. If there are too many compromises then we can review this.
Data and server environments - this would be best co-ordinated with the virtualisation farm. This is managed by Chris Parry (who is head of partnering and has responsibility for the university data repository: chris.parry@nottingham.ac.uk - +44 115 84 13196 mob 07788144718). A meeting will be convened with Chris.
Internal Workflow environments. Jeremy said to get in touch with Didier Libervichy. He is a spatial statistician working on quality and uncertainty issues in workflows.
James Goulding works at Horizon Digital Economy Research - lots of very cool and relevant stuff. His Neo-demographics work has great potential (more techy crowd sourcing - Smart Meter visualizations etc.).
Jon Garibaldi is part of the Uni Team - will arrange to meet him once I've jumped through the other hoops.
Chat with Chris Parry and Christine Middleton - 20150217
Really good chat with both of these. Essentially the university is taking the data element very seriously (RCUK mandate) and has a triumvirate working on the problem (library, IT infrastructure and research support). They will be rolling out the infrastructure over the next few months and then working on providing support material for researchers.
I have suggested that they take us on as a 'case study'/exemplar. There are a number of reasons behind this:
- I can add value to what they are doing (especially on licences)
- They need some research vision injecting into what they are doing (otherwise it may become a service without a focus)
- They would get to profile a project through the full lifecycle
- We get to shape the nature of the resource - potentially critical for the data analytics elements.
- We get a decent amount of hand-holding - it's in everyone's interest to make it work!
I'm meeting with Chris at 12:30 on Wednesday 25th.
Christine would like me to meet with the Management team for the project.
Basically lots of excitement about building in analytics, getting an engaged team on board and doing things properly... which was nice!
The state of the infrastructure is as follows:
Data repositories
They are taking a DSpace approach and have back-end storage in place. This should be rolled out in the next 6-8 weeks. If our data archive requirements are of the order of a few terabytes this should be free.
Metadata for the repository - should be Dublin Core focussed - may be Linked Open Data - still fluid.
Every? research object will get a DOI minted - this is excellent. The capability is not yet in place but will be.
They will throw a lot of resource at bulk ingesting data. This is a good thing too.
Licensing - this is still fluid: RCUK recommends a default CC-BY licence. The university is considering a default CC-BY-NC licence: I think this is a very bad idea - if you want to know why I will tell you. It's definitely a bad idea for this project.
Much of the thinking here is concerned with publications and 'Open Access' - I'd like them to start thinking in terms of data.
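To make the metadata discussion above a bit more concrete, here is a minimal sketch of what a Dublin Core focussed record for one research object might look like. It is only an illustration: the repository's actual schema is still fluid, the field values are hypothetical, the `record` wrapper element and the `dublin_core_record` helper are my own naming, and the DOI is a placeholder (DOI minting is not yet in place). The only fixed points are the standard Dublin Core elements namespace and the CC-BY licence URL.

```python
import xml.etree.ElementTree as ET

# Standard namespace for the 15 core Dublin Core elements.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields):
    """Build a flat XML record from (Dublin Core element, value) pairs."""
    root = ET.Element("record")  # hypothetical wrapper element
    for element, value in fields:
        child = ET.SubElement(root, f"{{{DC_NS}}}{element}")
        child.text = value
    return root


# Hypothetical values, for illustration only.
example_fields = [
    ("title", "Example building energy demand dataset"),
    ("creator", "Example Researcher"),
    ("date", "2015-02-27"),
    ("type", "Dataset"),
    ("rights", "https://creativecommons.org/licenses/by/4.0/"),
    ("identifier", "doi:10.0000/placeholder"),  # placeholder, not a real DOI
]

record = dublin_core_record(example_fields)
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```

Putting the licence in `dc:rights` as a URL keeps the licensing decision (CC-BY vs CC-BY-NC) machine-readable per object, which is one reason I'd like the thinking to be in terms of data and not just publications.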
Collaborative environments
The Confluence workplace collaborative tool will be phased out over the next 6-18 months and replaced with SharePoint. This will provide document management and versioning. It's probably best if we work with SharePoint ASAP - so people don't have to use two packages.
All the university domains will be migrated to the Office 365 environment (not sure when but I assume c. 3-4 months). This will allow multi-user editing in the same document. When tied in with SharePoint this will also provide versioning.
This means that OneDrive will be the back-end data store. This provides Dropbox-like sharing facilities. This will be available in c. 3-4 months.
Data Analytics
This is less well scoped. The data repositories element lets us put our stuff somewhere and describe it, but doesn't really act as an analysis point. We need to expose these data in a manner which allows them to be pre-processed and ultimately exposed to the modelling tools. This should be web-orchestrated (potentially using a polyglot approach: R, Python and SQL in addition to some geoservices). We'll need some back-end architecture - Postgres/Oracle, some web-native processing tools (Alteryx, GeoKettle, etc.) and possibly an orchestration environment (Taverna...). Getting all this to work may be hard - at Leeds, IT services were not very supportive and blocked pretty much everything. However, the landscape has changed dramatically, as demonstrated by the conversations I had with Chris and Christine today. I think there's a good chance we can get this to work. We've been invited in to talk to the right people, who already share a similar vision.
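As a rough illustration of the pre-process-then-model flow described above, the sketch below chains a SQL aggregation step into a Python step. Everything in it is an assumption for illustration: the in-memory SQLite database merely stands in for a Postgres/Oracle back end, and the `readings` table, its columns, the sample values and the function names are all hypothetical. A real set-up would sit behind web services and an orchestration tool.

```python
import sqlite3
import statistics


def load_readings(conn):
    """Create and fill a hypothetical table of raw meter readings."""
    conn.execute("CREATE TABLE readings (building_id TEXT, kwh REAL)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?)",
        [("B1", 10.0), ("B1", 14.0), ("B2", None), ("B2", 6.0), ("B3", 30.0)],
    )


def preprocess(conn):
    """SQL step: drop missing values and total the readings per building."""
    rows = conn.execute(
        "SELECT building_id, SUM(kwh) FROM readings "
        "WHERE kwh IS NOT NULL GROUP BY building_id ORDER BY building_id"
    ).fetchall()
    return dict(rows)


def model_input(totals):
    """Python step: summarise the totals for a downstream modelling tool."""
    return {
        "buildings": len(totals),
        "mean_kwh": statistics.mean(list(totals.values())),
    }


def run_pipeline():
    """Orchestrate the steps in order; SQLite stands in for Postgres/Oracle."""
    conn = sqlite3.connect(":memory:")
    try:
        load_readings(conn)
        return model_input(preprocess(conn))
    finally:
        conn.close()


result = run_pipeline()
print(result)
```

The point of keeping each step as a separate function is that any one of them could later be swapped for an R script, a geoservice call or a workflow node without touching the others.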
I am meeting Didier next week and will try and arrange a meeting with Jon Garibaldi - but it might be more appropriate to wait a while to see what happens when I meet the team.
Useful contacts
Alteryx
London Regional Office
359 Goswell Road, London EC1V 7JL, United Kingdom. Phone: +44 020 3002 4836
Modelling
We need data that will support the following modelling exercises (which in turn go through the simulations environments)
- Model
- Building
- Energy demands
- Multi modal transport
- 4 stage modelling
- 6 stage modelling
- District heating networks
- Demands from LV networks
- Demands from storage
- Model land-use transport interactions
Notes about the different general data categories
- Data differences
- Calibration data
- Evaluation data
- Sensor data
- In terms of sensoring results
- Data agencies
- Traffic count
- Ofgem
Other activities TBC
- Speak to the photogrammetry dept in Stuttgart
- Developing an ADE extension to CityGML for Stuttgart
- Doreen - Speak to Geomatics group
- Kyle
Background reading
See LLUCAS_ReadingList.md and associated bibliography