Stability Heuristic

Introduction

Definition of RO Stability.

This heuristic is aimed at detecting decay in workflows and workflow components.

Stability is one of the IQ dimensions that let us analyze some aspects of a RO or its resources. It is the degree of coherence between the resources of a RO: the firmness of the whole RO, based on how well its parts hold together.

When working with ROs, we may want to know whether something important has been removed, whether the RO is still stable after some changes, whether a workflow is already finished or fits with the other parts of the RO, etc. By using provenance we can trace the evolution of a RO over time and determine its stability after changes to its resources.

This measurement gives us a degradation (decay) value for the workflow, which can support decisions about executing it, recommending other similar workflows, etc.

Prototype demo

First implemented version of the Stability heuristic (I&A) and a very basic application where you can create your own traces and evaluate them.
For now it uses only 3 params: User, Action and Impact of the action. (I'll include Time and a relation with the MIM in the next versions.)

Remember that the values used in the heuristic are intuitive and what I really need are different scenarios with known stability (three scenarios provided by the Users, with expected stability values of 100%, 50% and 0%) so I can adapt/refine the current heuristic. We would like to be able to validate the prototype in one/two weeks so we can conclude this IQ dimension before Christmas.

I would appreciate it if you could find time to try it and give me some feedback, along with scenarios and the stability you would like to get (instead of the stability generated by the heuristic).

Download: Consistency.rar (executable .jar and a Readme. Please let me know if you have any doubts or problems)

Screenshots

This is a small application intended to let users evaluate the stability heuristic. The heuristic is based on params and values that should be refined through user feedback.

Possible params

There are some aspects that could influence a RO or the resources of the RO. Here is a list of them (it may not be complete).

  • Who:
    • The person who is working on the RO. Is he the leader of the research or a doubtful collaborator?
  • Which:
    • Which modifications has this person been making to the resources? Is he adding information, creating resources, removing elements, etc.?
  • How:
    • How do the modifications affect the RO? Are they insubstantial changes (e.g. the name of the workflow) or do they have a big impact on the resource (e.g. modifying 80% of the workflow structure)?
  • What:
    • What is changing in the RO? Does a change have the same impact on the stability of a RO when it is made to a .doc as when it is made to a workflow?
  • When:
    • When are those changes made? Are we still working on the RO daily, or is the RO stable and we only do maintenance operations once a year?
  • Any others?

Who

As we know, it is important to know who is responsible for the changes. We have defined some representative roles in order to identify different kinds of users:

  • Creator, Leader:
    • Creator of the RO, leads the research.
  • Trusted collaborator:
    • Collaborator who has been involved in several research projects with the leader, or who has many good references.
  • Regular collaborator:
    • Colleague from the same department or similar; there is no reason to distrust his work.
  • Doubtful collaborator:
    • External collaborator, or one with questionable references.

Which

We need to identify which kind of modifications can be done on a resource or on elements of the RO. For instance:

  • Create:
    • Creating a new empty resource (document, workflow, data, etc.)
  • Edit (Add):
    • Adding information to a resource. Working on a resource.
  • Edit (Remove):
    • Removing part of a resource, decreasing its content.
  • Delete:
    • Deleting a resource of the RO.

Note that a “Remove action” is generally a bit more delicate than an “Add action”: it is likely to have more consequences. But it is also possible that something is removed because it has to be removed (or that an addition to a resource corrupts the RO and decreases its stability).

Relations between users and actions

It is possible to establish relations between the different kinds of users and the actions that they’re able to perform. The following matrix contains weights that represent those relations. This is only a representation of the idea; the values are subject to change.

                        Create   Edit (add)   Edit (remove)   Delete
Owner, Leader           55%      2            1               0
Trusted collaborator    50%      1            0               0
Regular collaborator    45%      0            -1              0
Doubtful collaborator   35%      -1           -2              0

Note that we measure stability in percentages. A brief look at the previous matrix shows that, for instance, when the owner of a RO creates a new element it starts with 55% stability, but if a doubtful collaborator does so it starts with 35% stability.

In addition, to understand the values in the Edit columns we need another concept: the “How”.

How

It seems important to know the impact of the changes that have been made to a RO resource. As explained before, changing the name of a file is not comparable to modifying the structure of a workflow. So we have defined three groups of changes based on their impact.

  • Minimum:
    • An insubstantial change on a resource.
  • Regular:
    • Relevant changes on a resource.
  • Maximum:
    • Big changes on a resource.

We all know that changing a single line of code can have an important impact on a program. But we are interested in stability, and a major change to a resource is much more likely to affect the overall stability of that resource. To reflect this, we have assigned a percentage to each “changing impact”:

Impact    Percentage
Min       1%
Regular   3%
Max       5%

Now, by combining the User-Action matrix and the measures of impact we can evaluate the evolution of a RO.

The “Edit” columns of the User-Action matrix (weights of trust) will be multiplied by the impact of the action.

  • For instance, if an owner creates an element it starts with 55% stability. If he then edits the element by adding something with regular impact, the stability of the element becomes:
    • 55 plus 6% (2*3%) of 55 -> 58,3% stability
  • If a doubtful collaborator now edits (by adding to) the same element with minimum impact, the result is:
    • 58,3 minus 1% (-1*1%) of 58,3 -> 57,72% stability
  • And so on…
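
The update rule can be sketched in Python as follows. This is only a minimal illustration, assuming the weights of the User-Action matrix and the impact percentages listed above; the names INITIAL, EDIT_WEIGHT, IMPACT, create and edit are hypothetical and are not taken from the prototype.

```python
# Initial stability (%) of a newly created element, by role of the creator.
INITIAL = {"owner": 55.0, "trusted": 50.0, "regular": 45.0, "doubtful": 35.0}

# Weights of trust from the "Edit" columns of the User-Action matrix.
EDIT_WEIGHT = {
    ("owner", "add"): 2, ("owner", "remove"): 1,
    ("trusted", "add"): 1, ("trusted", "remove"): 0,
    ("regular", "add"): 0, ("regular", "remove"): -1,
    ("doubtful", "add"): -1, ("doubtful", "remove"): -2,
}

# Impact of a change, as a fraction of the current stability.
IMPACT = {"min": 0.01, "regular": 0.03, "max": 0.05}


def create(role):
    """Stability of an element right after it is created by `role`."""
    return INITIAL[role]


def edit(stability, role, action, impact):
    """Stability after `role` performs an edit ("add" or "remove")."""
    return stability * (1 + EDIT_WEIGHT[(role, action)] * IMPACT[impact])


s = create("owner")                     # 55
s = edit(s, "owner", "add", "regular")  # 55 plus 6% of 55
print(round(s, 2))                      # 58.3
s = edit(s, "doubtful", "add", "min")   # 58.3 minus 1% of 58.3
print(round(s, 2))                      # 57.72
```

A weight of 0 leaves the stability unchanged, which is why an edit whose weight is 0 in the matrix produces no change at all.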

Examples

We have emulated a trace for three elements of a RO. The three elements undergo the same actions in the same order, but the actions are performed by different kinds of users (O = owner, T = trusted collaborator, D = doubtful collaborator):

The first trace is entirely done by the owner of the RO.

Element 1
User   Action   Impact   Stability
O      Create            55
O      Add      Reg      58,3
O      Add      Min      59,5
O      Add      Max      65,4
O      Remove   Min      66,1
O      Add      Reg      70
O      Remove   Min      70,7
O      Remove   Reg      72,9
O      Add      Min      74,3
O      Remove   Min      75,1
O      Add      Min      76,6

And here we can see how its stability evolves in time:

The second trace shows the evolution of an element developed by an owner and a trusted collaborator:

Element 2
User   Action   Impact   Stability
T      Create            50,00
T      Add      Reg      51,50
O      Add      Min      52,53
T      Add      Max      55,16
O      Remove   Min      55,71
O      Add      Reg      59,05
T      Remove   Min      59,05
O      Remove   Reg      60,82
T      Add      Min      61,43
O      Remove   Min      62,04
T      Add      Min      62,67

And the evolution of the element 2 in time is:

The third trace represents an element submitted to changes by the owner, a trusted collaborator and a doubtful collaborator:

Element 3
User   Action   Impact   Stability
T      Create            50,00
O      Add      Reg      53,00
D      Add      Min      52,47
D      Add      Max      49,85
T      Remove   Min      49,85
O      Add      Reg      52,84
T      Remove   Min      52,84
D      Remove   Reg      49,67
T      Add      Min      50,16
D      Remove   Min      49,16
O      Add      Min      50,14

 And its trace:

If we put the three traces together we can see that the first and the second have increasing stability (the increase of the second is less pronounced than that of the first). The third, on the other hand, combines increases and decreases, ending with a stability nearly identical to its initial value.
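
A trace like these can be replayed with a few lines of Python. The sketch below is an illustration only: it assumes the User-Action weights and impact percentages given in the “How” section, and the tuple encoding of the trace and the function name replay are hypothetical. It replays the third trace (element 3).

```python
# Initial stability (%) by role: O = owner, T = trusted, R = regular, D = doubtful.
INITIAL = {"O": 55.0, "T": 50.0, "R": 45.0, "D": 35.0}

# Weights of trust for the two edit actions, by role.
EDIT_WEIGHT = {
    "O": {"add": 2, "remove": 1},
    "T": {"add": 1, "remove": 0},
    "R": {"add": 0, "remove": -1},
    "D": {"add": -1, "remove": -2},
}

IMPACT = {"min": 0.01, "reg": 0.03, "max": 0.05}


def replay(trace):
    """Return the stability after each step of a trace.

    A trace is a list of (role, action, impact) tuples and is expected
    to start with a ("<role>", "create", None) step.
    """
    history = []
    stability = None
    for role, action, impact in trace:
        if action == "create":
            stability = INITIAL[role]
        else:
            stability *= 1 + EDIT_WEIGHT[role][action] * IMPACT[impact]
        history.append(stability)
    return history


ELEMENT_3 = [
    ("T", "create", None), ("O", "add", "reg"), ("D", "add", "min"),
    ("D", "add", "max"), ("T", "remove", "min"), ("O", "add", "reg"),
    ("T", "remove", "min"), ("D", "remove", "reg"), ("T", "add", "min"),
    ("D", "remove", "min"), ("O", "add", "min"),
]

history = replay(ELEMENT_3)
print(round(history[-1], 2))  # 50.14
```

The intermediate values are not rounded between steps here, so they may differ from the table in the last digit.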

Stability of the RO

We have explained above how to measure the stability of each element of the RO. To obtain the stability of the whole RO we calculate the average of the stability of its elements.

If we calculate the RO stability based on our last example we obtain this evolution over time:

Step      Average
Step 1    51,7
Step 2    54,3
Step 3    54,8
Step 4    56,8
Step 5    57,21
Step 6    60,64
Step 7    60,9
Step 8    61,1
Step 9    62
Step 10   62,1
Step 11   63,1% Stability
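
The averaging can be sketched as below. This is just an illustration that takes the per-step stability values of the three example traces as input (copied from the tables, already rounded); the function name ro_stability_per_step is hypothetical.

```python
from statistics import mean

# Stability (%) of each element after every step, copied from the example traces.
ELEMENT_1 = [55.0, 58.3, 59.5, 65.4, 66.1, 70.0, 70.7, 72.9, 74.3, 75.1, 76.6]
ELEMENT_2 = [50.0, 51.5, 52.53, 55.16, 55.71, 59.05, 59.05, 60.82, 61.43, 62.04, 62.67]
ELEMENT_3 = [50.0, 53.0, 52.47, 49.85, 49.85, 52.84, 52.84, 49.67, 50.16, 49.16, 50.14]


def ro_stability_per_step(histories):
    """Average the element stabilities step by step (unweighted mean)."""
    return [mean(step) for step in zip(*histories)]


steps = ro_stability_per_step([ELEMENT_1, ELEMENT_2, ELEMENT_3])
print(round(steps[0], 1), round(steps[-1], 1))  # 51.7 63.1
```

Because the inputs are the rounded table values, a few intermediate averages can differ from the listed ones by about 0,01.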


When

It sounds reasonable to include the frequency of changes to a RO in our analysis of stability. A RO that undergoes many changes in a short period of time probably has questionable stability: the researchers are still working on it and shaping it. On the other hand, a solid RO with high stability shouldn’t be subject to many modifications, only the essential ones to keep it valid and alive.

Idea (needs to be reinforced): multiply the stability calculated without regard to frequency by a weight that reduces the stability at high frequencies.

For instance: 

Frequency   Value
Daily       0,9
Weekly      0,95
Monthly     1

  • If the example we are working with has a daily frequency of changes, the result would be:
    • 63,1 (stability) * 0,9 (Daily) = 56,79% of stability
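
As a small sketch of this idea (not a settled design, since the idea itself still needs to be reinforced), the frequency weight can be applied as a simple multiplier. The names FREQUENCY_WEIGHT and apply_frequency are hypothetical; the weights are the ones from the table above.

```python
# Weight applied to the stability depending on how often the RO changes.
FREQUENCY_WEIGHT = {"daily": 0.9, "weekly": 0.95, "monthly": 1.0}


def apply_frequency(stability, frequency):
    """Scale a stability value (%) by the weight of the change frequency."""
    return stability * FREQUENCY_WEIGHT[frequency]


print(round(apply_frequency(63.1, "daily"), 2))  # 56.79
```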

What

Not all the resources of a RO have the same importance. We should be able to identify which kinds of resources (workflows, data, docs, etc.) form the main structure of the research and give less weight to the others.

Idea (needs to be reinforced): We could set a distribution of importance depending on which kinds of resources are the centre of the research.

e.g. 

Element   Distribution
Docu      25%
Data      35%
WF        40%

 
According to the example, if we are told that element 1 is data, element 2 is a WF and element 3 is docu, and the weights of the different resources are those shown in the table above, we can say that the stability of our RO is:

  • 0,35(data)*76,6(element1) + 0,4(WF)*62,67(element2) + 0,25(docu)*50,14(element3) = 26,81 + 25,07 + 12,54 = 64,41% Stability
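
The weighting by kind of resource can be sketched as a weighted sum. This is an illustration under two assumptions that the text does not settle: the RO holds exactly one element of each kind, and the inputs are the final values of the three traces in the Examples section (76,6, 62,67 and 50,14). The names TYPE_WEIGHT and weighted_ro_stability are hypothetical.

```python
# Importance of each kind of resource; the weights add up to 1.
TYPE_WEIGHT = {"docu": 0.25, "data": 0.35, "wf": 0.40}


def weighted_ro_stability(elements):
    """Weighted sum of element stabilities.

    `elements` is a list of (kind, stability) pairs. The sketch assumes
    one element per kind, so that the weights used add up to 1.
    """
    return sum(TYPE_WEIGHT[kind] * stability for kind, stability in elements)


ro = weighted_ro_stability([("data", 76.6), ("wf", 62.67), ("docu", 50.14)])
print(round(ro, 2))  # 64.41
```

With several elements of the same kind, the weight of that kind would have to be shared among them (for example by averaging them first); that choice is left open here.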